A Statistical Methodology for
Constructing Synthetic Test Workloads (May 1979)

Wayne Thomas Graybeal, B.S., University of Oklahoma
M.A.,  University  of  Arizona 
Chairman  of  Advisory  Committee:  Dr.  Udo  W.  Pooch 

Computer  performance  measurement  and  evaluation  (CPME)  studies 
are  conducted  for  the  purpose  of  sizing  and  selecting  a  new  system 
(selection  studies);  during  the  design  phase  of  either  a  new  system  or 
a  hardware/software  modification  to  an  existing  system  to  assess  the 
impact  of  the  new  system/modification  (performance  projection  studies); 
or  to  assess  and  improve  the  level  of  performance  of  an  existing 
system (performance monitoring studies). Nearly all performance
measures used are related to the workload being processed by the system.
There  is  the  need  for  a  workload  which  emulates  the  actual  workload, 
yet  executes  in  less  time  and  does  not  compromise  the  adequacy  of 
the  measurements.  Such  a  workload  is  called  a  drive  or  test  workload. 

A  statistical  methodology  is  proposed  to  aid  in  the  construction 
of  a  test  workload.  The  major  elements  of  this  methodology  are 

(a) selecting the workload subset by constructing an overall
workload profile and then choosing a period which exhibits
characteristics pertinent to the evaluation study,

(b)  choosing  a  set  of  descriptor  variables  which  is  detailed 
enough  to  represent  the  demand  placed  upon  the  major  system  resources, 
but  is  not  so  detailed  as  to  complicate  later  stages  of  the  analysis, 

(c)  collecting  data  reflecting  the  values  of  the  descriptor 
variables  for  the  worksteps  in  the  selected  subset, 

(d) scaling the resource demand matrix so that each descriptor
has mean 0 and variance 1,

(e) applying principal components analysis to the scaled resource
demand matrix and retaining only those components needed to explain the
major part of the variability in the data,

(f) clustering the transformed resource demand vectors in the
principal components space using a non-hierarchical clustering
algorithm with a weighted Euclidean distance measure,

(g) designing synthetic jobs for each of the isolated clusters
using regression analysis to obtain predictor equations for the
parameter settings,

(h) forming a synthetic job mix by combining a sufficient number
of copies of the various synthetic jobs with appropriate parameter
settings and the desired arrival time of each, and

(i) validating the generated synthetic job mix by executing it
on the system being studied, comparing its resource demand
characteristics with those of the real subset, and adjusting the parameter
settings as necessary.
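
Steps (d) through (f) above can be sketched in a few lines of Python
with NumPy. This is a minimal illustration under stated assumptions, not
the procedure used in the case study: the 80 percent variance threshold,
the use of k-means as the non-hierarchical algorithm, the use of the
retained eigenvalues as distance weights, and all function names are
choices made here for the example only.

```python
import numpy as np

def standardize(X):
    """Step (d): scale each descriptor (column) to mean 0 and variance 1.
    Assumes no descriptor is constant (zero variance)."""
    return (X - X.mean(axis=0)) / X.std(axis=0)

def principal_components(Z, min_explained=0.8):
    """Step (e): project the scaled data onto the leading principal
    components that together explain at least `min_explained` of the
    total variance (the threshold is an assumption of this sketch)."""
    corr = np.cov(Z, rowvar=False, bias=True)  # correlation matrix of scaled data
    eigvals, eigvecs = np.linalg.eigh(corr)    # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1]          # reorder to descending
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    cum = np.cumsum(eigvals) / eigvals.sum()
    k = min(int(np.searchsorted(cum, min_explained)) + 1, len(eigvals))
    return Z @ eigvecs[:, :k], eigvals[:k]

def kmeans_weighted(Y, weights, n_clusters, n_iter=100, seed=0):
    """Step (f): k-means (one non-hierarchical method) using a weighted
    squared Euclidean distance; `weights` has one entry per component."""
    rng = np.random.default_rng(seed)
    centers = Y[rng.choice(len(Y), size=n_clusters, replace=False)]

    def assign(c):
        d2 = (((Y[:, None, :] - c[None, :, :]) ** 2) * weights).sum(axis=2)
        return d2.argmin(axis=1)

    for _ in range(n_iter):
        labels = assign(centers)
        new_centers = np.array([
            Y[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(n_clusters)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return assign(centers), centers
```

Each row of `X` is assumed to hold the descriptor values of one workstep.
Weighting by eigenvalue gives more influence to the components that carry
more of the variability; other weightings are equally admissible.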

A  detailed  case  study  of  the  workload  processed  by  the  Amdahl 
470/V6  at  Texas  A&M  University  is  presented  illustrating  many  of  the 
proposed  techniques.  Suggestions  for  further  work  are  included. 
CHAPTER I
INTRODUCTION


1.1 Background

The development of the electronic digital computer, begun in the
late 1940's and continuing until the present time, has had a dramatic
effect  on  nearly  every  field  of  human  endeavor.  Rapid  advances  in  both 
hardware  and  software  have  exceeded  even  the  most  ambitious  projections. 
With  each  advance  in  hardware  and/or  software  came  another  level  of 
complexity.  This  led  to  the  ultra-fast,  highly  sophisticated  systems 
of  today  in  which  the  synergistic  effects  of  their  combined  hardware 
and  software  subsystems  can  yield  performance  which  is  surprising  even 
to  the  system  designer.  It  has  been  suggested  [78]  that  these  systems 
are  too  complicated  for  the  problems  they  are  intended  to  solve,  and 
that  their  complexity  makes  them  inherently  inefficient.  The  degree  of 
truth in these suggestions may be debated; however, it is apparent that
the  computer  has  evolved  into  one  of  the  most  complicated  systems 
yet  devised  by  man. 


The Communications of the Association for Computing Machinery is
used as a pattern for format and style.
Computer Performance Measurement and Evaluation (CPME) is a term
coined  to  refer  to  a  loosely-defined  branch  of  computer  science.  It 
has  evolved  to  satisfy  the  need  for  understanding  and  predicting  the 
performance  of  computer  systems.  As  the  name  implies,  there  are  two 
different  aspects  to  the  study  of  the  performance  of  a  computer  system. 
The  first  is  measurement,  the  act  of  ascertaining  the  extent  of  the 
performance.  The  second  is  evaluation,  the  act  of  examining  or  judging 
the value of performance [29]. This field is not a new one [20]; however,
recent technical advances in hardware and a rethinking of the
problem  have  led  to  a  broadening  of  scope.  While  the  early  researchers 
were  concerned  with  only  the  performance  of  the  hardware  [64],  the 
performance  of  a  given  computer  system  has  been  realized  to  be  a 
function  of  the  total  hardware/software  package.  Thus,  such  seemingly 
unrelated  areas  as  program  behavior  [85],  computational  complexity  [6] 
and  software  engineering  [38]  have  been  recognized  as  having  an  impact 
on  the  performance  of  computer  systems. 

1.2  Types  of  Evaluation  Studies 

The  development,  acquisition  and  maintenance  of  a  computer 
system  is  an  expensive  proposition.  Unfortunately,  an  efficient  and 
an  effective  system  appears  to  be  the  exception  rather  than  the  rule 
[56]. Thus, there is a continuing interest, on the part of both
management and systems analysts, in understanding and
improving the performance of computer systems.

Lucas [64] classified evaluation studies by the reasons for which
they are conducted. Selection evaluation studies are conducted for the
purpose  of  sizing  and  selecting  a  new  system.  This  type  of  evaluation 
assumes  that  the  relative  performance  in  accomplishing  a  certain  task  is 
a  factor  in  choosing  one  system  over  another  system.  Performance 
projection  studies,  on  the  other  hand,  are  conducted  during  the  design 
phase  of  either  a  new  system  or  a  hardware/software  modification  to  an 
existing  system.  The  aim  of  such  a  study  is  to  assess  the  impact  that 
certain  features  of  the  new  system  or  subsystem  will  have  on  the 
system's  performance.  Such  an  evaluation  is  handicapped  in  most  cases 
by  the  lack  of  a  prototype.  The  results  obtained  are  therefore 
largely  theoretical  and  subject  to  validation  once  the  system  or 
subsystem  design  is  implemented.  The  third  type  of  evaluation  is 
termed  performance  monitoring.  This  type  of  study  has  as  its  aim  the 
assessment  and  improvement  of  the  level  of  performance  of  current 
systems.  Results  of  this  type  of  evaluation  can  be  used  to  "tune"  a 
system,  thus  attaining  a  higher  level  of  efficiency;  to  establish  a 
profile of system activity in order to apply priority algorithms and
establish billing procedures; or to forecast the impact of a proposed
change  in  either  the  system  or  the  workload. 

There appears to be a degree of commonality in both purpose and
technique in the classifications proposed by Lucas [64]. A more
meaningful classification might be the one proposed by
Svobodova [88]. A study which is conducted to assess the performance
of one system relative to another is called a comparative evaluation.
A study that is conducted to evaluate the system's performance relative
to system parameters and/or system workload is termed an analytical
evaluation.

1.3 Evaluation Techniques

Three general techniques have emerged in the evaluation of
computer systems: analytical, simulation, and empirical [39]. The
technique  to  use  is  affected  by  such  factors  as  why  the  study  is 
being  conducted,  the  level  of  detail  needed  in  the  study,  and  the 
availability  of  the  system  being  studied. 

Analytical techniques are characterized by the representation of
the system in the form of a mathematical model, which is then solved
by ordinary mathematical means. Probably the most common
mathematical model of a computer system is that of a queuing system
[10,17,19,21,27,37,77]. In this representation, the system or
subsystem being studied is considered as a service facility. Jobs or
tasks are considered as customers arriving at the service facility
requiring some quantity of service [40,58]. There are a number of
disadvantages to using analytical techniques in an evaluation study.
First, a mathematical model which is detailed enough to accurately
represent today's highly complex computer system is likely to be
mathematically intractable. Second, in an effort to make the model
solvable, the researcher may be required to make a number of
assumptions. For example, if a queuing model is used, it is common to
assume that the interarrival times are independent and that the system
has achieved a stochastic balance (steady state) [40,88]. The validity
of such assumptions can certainly be questioned in many studies.
A third disadvantage is that even if the system is accurately
represented and the assumptions deemed valid, the researcher has the
problem of estimating system parameters. For these reasons, analytical
techniques have found little utility in full-scale performance evaluation
studies. They have, however, been used in studies involving subsystems
or particular aspects of a system's behavior such as CPU scheduling
[57,68], and the management of I/O channels [34,84].
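
As a minimal illustration of the queuing representation described above,
the sketch below evaluates the textbook single-server M/M/1 queue. The
choice of M/M/1 (Poisson arrivals, exponentially distributed service
times, one server) is an assumption made here for illustration and is not
attributed to any of the cited studies; the function name and its
parameters are likewise hypothetical.

```python
def mm1_metrics(arrival_rate, service_rate):
    """Steady-state measures of an M/M/1 queue.

    arrival_rate: mean number of jobs arriving per unit time
    service_rate: mean number of jobs the facility can serve per unit time
    """
    if arrival_rate <= 0 or service_rate <= 0:
        raise ValueError("rates must be positive")
    rho = arrival_rate / service_rate  # utilization of the service facility
    if rho >= 1:
        # The steady-state assumption fails when jobs arrive at least
        # as fast as they can be served.
        raise ValueError("no steady state: arrival rate must be below service rate")
    return {
        "utilization": rho,
        "mean_jobs_in_system": rho / (1 - rho),
        "mean_response_time": 1 / (service_rate - arrival_rate),
    }
```

Note that the two assumptions mentioned in the text appear directly in
the sketch: the formulas hold only for independent interarrival times,
and the explicit check on `rho` enforces the existence of a steady state.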

The second general technique used in evaluation studies is
simulation. In this technique, the structure of the system is reflected in
a computer program. The behavior of the system under particular
conditions can then be studied by varying the parameters of the simulator.
This technique avoids the problem of intractability encountered in analytic
methods, and generally does not require the researcher to make as many
assumptions. There are, however, problems with this technique as well.
If a high degree of detail is required in the system model, the
simulator can become quite expensive to develop and to use. Furthermore,
to be a useful tool, the simulator must be validated. That is, it must
be demonstrated that the simulator behaves in the same manner as the
real system when presented with identical conditions. Often this aspect
of the simulation study is neglected [39], which leads to questionable
interpretation of any results. There are many examples in the
literature [60,61] of full scale system simulations.

The third general category of evaluation techniques involves
studies made through the observation of some real system (empirical
analysis). This generally entails the collection and analysis of data
reflecting the system's performance. Much data is collected through
accounting logs and other means at every computer installation. Only
recently has a serious attempt been made at analyzing this data.
Empirical techniques, aside from their utility in
conducting separate evaluation studies, also provide a means of
validating results obtained from an analytical or simulation study [39].
Problems encountered in using empirical techniques include the
unavailability of the system and the degradation of system performance
because of the monitoring process.

1.4 Performance Measures

In the past, the relative performance capability of a computer
system was judged by such hardware characteristics as CPU cycle time,
memory access time and the time needed to execute particular operations
(e.g., add) [64,68]. It was thought that the shorter these times were,
the more "powerful" the system was and hence the higher its performance
rating. In later years, especially with multiprogrammed systems, it has
become apparent that, although important, these measures are generally
inadequate in characterizing the performance of a given system. Many
other "performance measures" have been developed and are considered to be
more useful in assessing performance. Some of the more popular of these
measures are detailed below. For a more complete list, see Svobodova [88].

One of the more common measures of the performance of a computer
system is throughput. Throughput is defined to be the amount of useful
work completed per unit time when executing a given workload [9,56,88].
Since throughput is generally used in comparative evaluation studies, a
related measure, relative throughput, has been developed. The relative
throughput is defined as the ratio of the elapsed time required to
process a given workload on one system to the time required to
process the same workload on another system [29,41,88]. Still
another related measure is the throughput rate, defined to be the
average number of task completions per unit time [75].

With the advent of multiprogrammed systems, a number of measures
were developed to assess the performance of these systems relative to
monoprogrammed systems. One of these is the Elapsed Time
Multiprogramming Factor (ETMF) which is defined [82,88] as the ratio of the
turnaround time of a job in a multiprogrammed environment to the turnaround
time when it is the only job in the system. Another related measure is
the gain factor [88] which is the ratio of the total system time needed
to execute a set of jobs in a multiprogrammed environment to the total
system time needed to execute the same set of jobs serially. Still
another measure related to multiprogramming is the internal delay time
[88], which is the ratio of processing time of a job in a
multiprogramming environment to the time required when it is the only job
in the system.
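
The ratio measures defined in this section can be written out directly.
The sketch below follows the definitions as stated in the text; the
function and parameter names are hypothetical, and all times are assumed
to be expressed in the same unit.

```python
def _ratio(numerator, denominator):
    """Helper: divide after checking that the denominator is positive."""
    if denominator <= 0:
        raise ValueError("denominator must be positive")
    return numerator / denominator

def throughput_rate(task_completions, elapsed_time):
    """Average number of task completions per unit time."""
    return _ratio(task_completions, elapsed_time)

def relative_throughput(elapsed_system_a, elapsed_system_b):
    """Elapsed time for a workload on one system relative to another."""
    return _ratio(elapsed_system_a, elapsed_system_b)

def etmf(turnaround_multiprogrammed, turnaround_alone):
    """Elapsed Time Multiprogramming Factor for a single job."""
    return _ratio(turnaround_multiprogrammed, turnaround_alone)

def gain_factor(total_time_multiprogrammed, total_time_serial):
    """Total system time for a job set, multiprogrammed versus serial."""
    return _ratio(total_time_multiprogrammed, total_time_serial)

def internal_delay(processing_multiprogrammed, processing_alone):
    """Processing time of a job, multiprogrammed versus alone."""
    return _ratio(processing_multiprogrammed, processing_alone)
```

For example, a job with a turnaround time of 30 minutes under
multiprogramming and 10 minutes when run alone has an ETMF of 3, while a
job set needing 60 minutes multiprogrammed and 100 minutes serially has a
gain factor of 0.6.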

Other advances in software and hardware necessitated more measures
of a system's performance. For example, virtual memory systems
necessitated a measure of the behavior of page and segment replacement rules.
Page (segment) fault rate [23,88] is the most frequently used measure of
this performance.
1.5 Means of Measurement

An empirical performance evaluation study requires that data be
collected on the system activity. There are a number of ways this data
can be collected. The simplest way [75] is through direct observation
of the system, using the system console and the behavior of I/O
units as an indication of system performance. The type of information
which could be gained through this type of observation would appear to
be severely limited. Another source of information on the behavior of
the system is the system accounting logs. These logs can be used to
obtain information on resource utilization at a job or job-step level.
A third source is a monitor. There are three general types
of monitors available: hardware, software and hybrid.

A  hardware  monitor  [15,75]  is  logically  and  physically  distinct 
from  the  system  being  monitored.  System  activity  is  routed  to  the 
monitor  through  a  series  of  probes.  For  instance,  the  period  of  time 
a  processor  spends  in  the  WAIT  state  could  be  monitored  by  installing 
a  probe  on  the  line  leading  to  the  WAIT  light  on  the  system  console. 
Hardware  monitors,  though  they  can  be  used  to  measure  essentially 
any  event,  are  limited  in  that  they  cannot  give  an  indication  as  to 
the  cause  of  the  event. 

An alternative to the hardware monitor is the software monitor.
Software monitors are programs which reside on the system being
monitored. There are two general types of software monitors. The first,
the interrupt-intercept monitor [75], is activated whenever an event
which causes an interrupt occurs. Rather than control being passed
directly to the interrupt handler, it is instead routed to the monitor
which records the system state and then passes control to the
appropriate interrupt handler. The second type, the sampling monitor, is
activated at certain time intervals, at which time it records the system
state. Regardless of which type of software monitor is used, there is
a serious drawback. Since the monitor is resident in the host
system, it competes for system resources along with normal jobs. Thus,
the use of a software monitor can degrade system performance through
the introduction of additional system overhead. This degradation has
been termed the "artifact" of using a software monitor. This artifact
can be a serious problem in evaluation studies, since the results
obtained on system activity are biased to some degree.
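
The operation of a sampling monitor can be illustrated with a small
simulation. The sketch below is not a real monitor: it assumes a
hypothetical, already-known timeline of processor states and shows only
how recording the state at fixed intervals yields an estimate of the
fraction of time spent in each state. The function name and the timeline
format are assumptions of this example.

```python
from bisect import bisect_right
from collections import Counter
from itertools import accumulate

def sample_state_fractions(timeline, interval):
    """Estimate the fraction of time spent in each state by sampling.

    timeline: list of (duration, state) pairs in chronological order
    interval: time between successive activations of the monitor
    Returns (fractions by state, number of samples taken).
    """
    ends = list(accumulate(duration for duration, _ in timeline))
    total = ends[-1]
    counts = Counter()
    t = 0
    while t < total:
        # Locate the timeline entry that is active at time t.
        _, state = timeline[bisect_right(ends, t)]
        counts[state] += 1
        t += interval
    n = sum(counts.values())
    return {state: c / n for state, c in counts.items()}, n
```

The number of samples returned is a rough proxy for the overhead the
monitor would add: a shorter interval gives a finer estimate but more
activations, which is the trade-off behind the artifact discussed above.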

In an effort to minimize the disadvantages of pure hardware and
software monitors, the hybrid monitor has been developed. The hybrid
monitor is essentially a combination of the two previous approaches.
A minicomputer is normally attached as an "intelligent" terminal to the
host computer. Hardware probes are used to detect event occurrences,
just as in the pure hardware approach. In addition, the hybrid monitor
has the ability to interrupt the host system and cause status
information to be sent to it. Thus, a hybrid monitor can link event
occurrences to their causes, which pure hardware monitors cannot. Further,
since the required software support within the host system is limited,
the software monitor artifact is reduced. This approach to monitoring
appears to be the most promising.
1.6 Influence of Workload on System Performance

Nearly all of the performance measures mentioned earlier relate
the performance of a system to a particular workload. It has long been
recognized [13] that the choice of the workload will have a major
impact on the observed performance. For example, if one of the aspects
of system performance that is being studied is the percent of channel
utilization, an I/O-bound workload would provide entirely different
results from those of a compute-bound workload.

The workload (jobload) of a computer system is defined [75] as the
set of all programs, data, and commands that are submitted to the
system for subsequent execution. Since a workload has such a dramatic
effect on the performance of a given computer system, the problem of
how to represent or characterize the workload has arisen in practically
every computer system evaluation study undertaken [32]. In many cases,
workload characterization is the hardest technical problem to solve
for the investigator [32]. There are many reasons for this, the chief
one being the nonrecurrent nature of a computer workload. That is, if
a system is handling a repetitive workload in which the same set of
requests is made cyclically, then the workload characterization
problem could be solved simply by examining the set of requests made in
one cycle. Unfortunately, in most cases the workload is not repetitive,
hence no general model can be developed.

Executing the entire job profile on each potential computer system
that is to be evaluated can be expensive and time-consuming. Thus, there
is the need for a workload which emulates the actual workload, yet
executes in less time and does not compromise the adequacy of the
measurements. Such a workload is called a drive or test workload.

The  form  of  the  test  workload  depends  upon  the  techniques  used 
in  the  evaluation  study.  If  empirical  studies  of  the  system  are  made, 
the  test  workload  will  consist  of  an  executable  job  stream.  When 
analytical  models  of  the  system  are  used,  the  test  workload  could  be 
represented  in  the  form  of  interarrival  and  service  distributions  [23]. 
A  simulation  study  would  require  an  abbreviated  job  description  in  a 
form  compatible  with  the  simulator  that  uses  this  workload. 

1.7  Properties  of  Test  Workloads 

Regardless of the form of the test workload, there are a number of
properties which the test workload should possess to enhance its
usefulness in an evaluation study. Ferrari [32] lists eight such
properties. Some of the more important of these properties are given below.

Representativeness. The most important characteristic of a test
workload is that it be representative of the actual workload. A test
workload is representative if the system's measured performance when
executing the test workload approximates the system's measured
performance when executing the actual workload. This definition implies the
existence of a distance function or metric by which it is possible to
measure the relative degree of representativeness between two
candidate test workloads. Unfortunately, such a metric does not exist,
since the degree of representativeness depends not only on the
performance measures used, but also on the relative weights assigned to each
measure [31].
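
The dependence on measures and weights can be made concrete with a small
sketch. The distance below (a weighted mean of relative deviations) is
only one of many possible choices and is an assumption of this example,
as are the function name, the measure names and all numeric values; it is
offered to show why no single metric can be singled out, not as a
recommended metric.

```python
def representativeness_distance(actual, test, weights):
    """Weighted mean relative deviation between measured performance
    under the actual workload and under a test workload.

    actual, test: dicts mapping measure name -> measured value
    weights: dict mapping measure name -> non-negative weight
    """
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("at least one weight must be positive")
    deviation = sum(
        w * abs(test[name] - actual[name]) / abs(actual[name])
        for name, w in weights.items()
    )
    return deviation / total_weight

# Hypothetical measurements for the actual workload and two candidates.
actual = {"throughput": 100.0, "cpu_utilization": 0.80}
candidate_a = {"throughput": 90.0, "cpu_utilization": 0.80}
candidate_b = {"throughput": 100.0, "cpu_utilization": 0.60}
```

With weights of 0.9 on throughput and 0.1 on CPU utilization, candidate
B is the closer of the two; with the weights reversed, candidate A is.
The ranking of the candidates is therefore a property of the chosen
weights as much as of the workloads themselves.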
Reproducibility. Aside from being representative, a test workload
must be reproducible. Comparative evaluations as defined earlier are
designed to assess the relative performance of two or more systems. If
the effects of the different performance capabilities of the systems
are to be isolated, the same test workload must be executed on each
system. If different test workloads are executed, any variation in the
obtained performance measures could be due to either the test workload
or the actual system differences. A second reason that system test
workloads must be reproducible is that a replication of the basic
evaluation experiment may be desirable. This repetition allows for greater
credence in the results.

Flexibility. A flexible test workload is one that can be easily
modified. A researcher may wish to modify the test workload for a
number of reasons. First, the actual workload of a computer system is
likely to change over time. If the test workload is to remain
representative, it must be changed also. Second, in establishing the
representativeness of a test workload, it may be necessary to
iteratively adjust the characteristics of the test workload to realign the
properties with those of the actual workload. The ease with which
these changes can be made has an impact on the cost of the evaluation
study, in terms of both time and expended resources.

Portability. A requirement in comparative evaluation studies is
that the same workload be executed on a number of different systems.
A test workload should be constructed so that it may be transported
between systems with a minimum of effort. Severe modifications to a
test workload can lead to biased results such as those mentioned in the
section on reproducibility.

1.8 Dissertation Topic

The purpose of this research is to investigate the development
of test workloads. There does not appear to exist a unified,
comprehensive methodology which would allow the systems analyst to produce
a concise representative workload for use in system evaluation studies,
although considerable work has been done in the characterization and
representation of workloads. This research is designed to aid in
the development of such a methodology.

Major  goals  of  this  research  include: 

(a)  To  investigate  the  characterization  of  a  computer  system 
workload  at  a  gross  system  level  (daily/hourly  characteristics)  to 
aid  in  the  selection  of  interest  periods  in  a  performance  evaluation 
study. 

(b)  To  examine  the  input  job  stream  at  a  job  or  job  step  level 
with  the  aim  of  characterizing  the  pattern  of  resource  requests. 

(c)  To  investigate  the  design  of  parameterized  synthetic  jobs, 
which  can  be  used  in  the  construction  of  test  workloads. 

(d) To attempt to establish a step-by-step procedure which can
be used by systems personnel in developing test workloads for use in
evaluation studies.

(e) To examine the procedure of (d) with an eventual aim of
automating as much of the procedure as appears feasible. Though full
automation of the procedure is not a goal of this research, the
anticipated difficulties in this automation process will be considered.
1.9 Dissertation Contents

This dissertation is organized according to the three phases
involved in the development of test workloads. These phases are

(a) the representation of the real workload,

(b) the analysis of the real workload, and

(c) the construction of the test workload.

A literature review of the current state of the art is contained
in Chapter II. The literature review surveys the attempts made in the
past few years to solve the problem of constructing test workloads
suitable for use in performance evaluation studies.

Chapter III addresses the problem of representing the real
workload. Some considerations in selecting an appropriate subset of the
real workload, choosing a set of descriptors to use in representing
each workstep, and collecting data to obtain real workload values for
the descriptors are outlined.

Chapter IV contains a description of various statistical
techniques useful in analyzing the represented worksteps for similar
resource demand patterns, and summarizing the often voluminous amounts
of data in an accurate and succinct manner.

The actual construction of the test workload is described in
Chapter V. Some considerations and techniques for designing synthetic
jobs are outlined. Procedures for validating (verifying the accuracy
of) the synthetic job stream are also given.

Chapter VI consists of a detailed case study illustrating many
of the techniques outlined in previous chapters. The test case is not
carried to conclusion (i.e. a complete ready-to-run benchmark) due to
a need to limit the scope of the research. The details necessary to
carry it to such a conclusion are outlined.


The research is examined in Chapter VII with the aim of producing a
description of a fully automated test workload generator. The
results of the research are summarized, the more important points
originated in this research are delineated, and areas of future research
are suggested.

CHAPTER II
LITERATURE  SURVEY 


2.1  Introduction 

The  workload  of  a  computer  system  consists  of  all  individual  jobs 
and  data  that  are  processed  by  the  system  during  a  specified  period  of 
time  [86].  One  of  the  principal  problems  facing  a  researcher  conducting 
a  performance  evaluation  of  a  computer  system  is  representing  the 
system  workload  in  a  form  compatible  with  the  evaluation  techniques 
employed.  It  was  mentioned  earlier  that  the  test  workload  should  be 
representative  of  the  actual  workload  in  order  that  valid  performance 
measures  can  be  obtained;  reproducible  to  allow  replication  of  the 
experiments  and  verification  of  questionable  results;  flexible  to  allow 
easy  modification;  and  portable  to  minimize  the  effort  required  to 
transport  the  workload  between  systems.  The  criteria  for  a  "good"  test 
workload  are,  to  some  degree,  opposing,  requiring  compromise  on  the  part 
of  the  researcher. 

Some  of  the  factors  influencing  the  development  of  a  test  workload 
are  the  selection  of  which  jobs  to  include  in  the  workload  model,  the 
characterization  of  jobs  in  the  real  workload,  and  the  type  of  test 
workload  to  use.  The  approaches  to  this  problem  which  have  been  taken 
in recent years will be surveyed in this chapter.

2.2 Selection of the Workload

In most evaluation studies it is not possible to execute the entire
job profile on each potential computer system to be evaluated. Some
workloads are non-recurrent in the sense that there is no readily
discernible cyclic pattern of resource demands. Other workloads have an
extremely long repetition cycle (e.g. one week), hence inclusion of the
entire job profile for a given cycle would not be feasible. This
requires that a subset of the actual workload be used in constructing a
test workload.

Choosing which jobs are to be included in a test workload is not a
well-defined task. Hellerman and Conroy [42] list three important
criteria in selecting jobs. These are

(a) those jobs which are run most frequently,

(b) those jobs which account for most of the system time and
resource use, and

(c) those jobs whose completion-time requirements are most
critical to the system's mission.

The identification of these jobs may be somewhat difficult.

Since  the  test  workload  will  normally  be  constructed  using  only  a 
subset  of  the  actual  workload,  one  approach  to  the  selection  of  jobs  is 
to  use  the  techniques  of  statistical  sampling  [83].  Jobs  are  selected 
at  random  from  the  real  workload  for  use  in  constructing  the  test 
workload.  As  with  any  sampling  procedure,  there  is  a  risk  of  obtaining 
a  non-representative  sample,  and  thus  constructing  a  test  workload  which 
does  not  resemble  the  actual  workload. 


Another  approach  to  the  selection  of  the  workload  is  to  divide  the 
actual  workload  into  classes  based  on  job  functions.  Then  a  number  of 
jobs  could  be  selected  from  each  class  based  on  their  proportion  in  the 
total  mix  [52].  This  segregation  of  the  actual  workload  into  classes 
could  be  done  manually,  or  automated  through  clustering  algorithms. 
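
The class-based selection just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the record layout (a `job_class` field), the function name `select_by_class`, and the rule of keeping at least one job per class are hypothetical and are not taken from [52].

```python
import random
from collections import defaultdict

def select_by_class(jobs, sample_size, seed=0):
    """Pick about sample_size jobs, keeping each class's share of the mix."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for job in jobs:
        by_class[job["job_class"]].append(job)
    selected = []
    for members in by_class.values():
        # Quota proportional to the class's share; keep at least one job per class.
        quota = max(1, round(sample_size * len(members) / len(jobs)))
        selected.extend(rng.sample(members, min(quota, len(members))))
    return selected

# Hypothetical workload: 60 compilations, 30 sorts, 10 file updates.
workload = (
    [{"name": f"compile{i}", "job_class": "compile"} for i in range(60)]
    + [{"name": f"sort{i}", "job_class": "sort"} for i in range(30)]
    + [{"name": f"update{i}", "job_class": "update"} for i in range(10)]
)
subset = select_by_class(workload, sample_size=20)
```

With this input the subset holds 12, 6 and 2 jobs from the three classes, mirroring the 60/30/10 mix. Because of rounding and the one-job minimum, the total can differ slightly from `sample_size` for other mixes.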

Still  another  approach  to  the  selection  of  the  workload  is  to  pick 
that  period  of  activity  which  has  the  greatest  influence  on  the  problem 
being  studied.  For  example,  if  the  load  on  the  system  is  being  studied, 
an  obvious  workload  to  consider  is  the  period  of  peak  activity.  It 
should  be  apparent  that  if  this  approach  is  taken,  the  test  workload  will 
not be representative of the entire workload. This may, however, not
be a serious constraint on the validity of the study [13].

Once a subset of the workload is selected for inclusion into the
workload model, data must be collected which reflect the
characteristics of the jobs included. System accounting logs, such as IBM's
System Management Facility [47], or trace facilities supplied with the
system, such as IBM's Generalized Trace Facility [48], are ready sources
of such data. If these facilities are not available, data must be
collected with a monitor [75]. The first approach (using logs or trace
facilities already present) appears to be the more popular [4,46,83,91]
since the data are available with essentially no required modification
to the system. The second alternative (a monitor) has, however, also
been used [10].

2.3  Characterization  of  the  Workload 

Before the characterization of a real workload can be made, a basic
unit of work must be defined. In some evaluation studies, the unit of
work may be a transaction, while in others it may be a job or job-step.
Evaluation studies involving a general purpose system may utilize both
transactions and jobs. Different types of workloads are generally
considered separately at first, but lend themselves to being combined
into a composite workload. In the remainder of this section, the job
will be adopted as the basic unit of work. It should be recognized that
similar considerations apply for transactions in a time-sharing/
interactive environment.

Jobs  or  job  steps  in  a  batch  processing  environment  can  be  described 
by  the  type  of  processing  required,  or  alternately  by  the  demand  they 
place  upon  system  resources  [86].  The  first  approach  is  termed  the 
service  demand  representation,  while  the  latter  is  termed  the  resource 
demand  representation.  When  the  service  demand  approach  is  used,  some 
of  the  typical  processing  requirements  might  be  compilation,  sort-merge, 
or  file  updates  [86].  The  distribution  of  the  total  jobs  among  the 
different  processing  groups  provides  an  indication  of  the  nature  of  the 
workload.  Since  this  description  does  not  depend  on  the  particular  type 
of  computer  system  (i.e.  a  program  which  requires  compilation  on  one 
system  will  generally  require  compilation  on  another),  it  can  be  referred 
to  as  a  system  independent  description.  Independence  of  any  given 
system  means  that  it  can  be  used  in  comparative  evaluations  involving 
heterogeneous systems. This characterization is highly desirable,
particularly in selective evaluations in which a potential customer is
attempting to decide which of two or more vendors' equipment
will best satisfy the customer's needs. The service demand
representation is, however, rarely feasible, since information on the
processing requirements for each individual job in the work stream is
difficult, if not impossible, to obtain.

An  alternate  characterization  is  obtained  if  the  computer  system 
is  viewed  as  a  collection  of  resources  upon  which  the  users  (workload) 
place  demands.  Some  of  the  resources  common  to  many  computer  systems 
with  corresponding  demands  include  the  processor  (CPU  time),  I/O  channels 
and  devices  (number  of  I/O  activities),  core  memory  (size  of  the  region), 
and  unit  record  devices  (number  of  cards  read  or  punched,  number  of 
lines  printed)  [86].  The  demands  for  these  resources  can  be  considered 
as  the  characteristic  variables  of  the  real  workload  processed  by  the 
system.  A  job  can  be  described  by  a  set  of  these  characteristics  [1], 
and  since  the  system  only  recognizes  a  job  by  its  pattern  of  resource 
demands,  two  jobs  with  the  same  resource  demands  would  be  characterized 
and  treated  identically  [86].  It  should  be  noted  that  the  resource 
demands  of  a  given  job  will  vary  from  one  computer  system  to  another. 
Thus, this characterization is system dependent, and should only be used
in comparative evaluations involving homogeneous systems. Its main
usefulness would appear to be in system improvement studies involving a
single system.

Regardless of whether the resource demand or service demand
approach is used to characterize the workload, a job can be represented
by an n-tuple v = (v_1, v_2, ..., v_n), where v_i represents the magnitude
of the demand for the i-th resource or service. Using this
representation, a number of different approaches have emerged for selecting jobs
and setting the levels of their demands for each of the resources or
services. Ferrari [32] describes five such approaches.

The first approach involves constructing the probability
distribution of the demand levels in the real workload. By sampling these
distributions, the appropriate demand for each resource or service can
be derived for each job included in the workload model. This method was
used by Schwetman and Browne [81], and a simulation based on this
technique was described by Rosen [78]. The sampling technique used
would appear to affect the representativeness of a workload description
produced by this method.
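
A minimal sketch of this first approach is given below, assuming that each resource's distribution is approximated by the values actually observed (an empirical distribution) and that resources are sampled independently of one another. The resource names and the function `sample_job_demands` are hypothetical. This is not the procedure of [81] or [78].

```python
import random

def sample_job_demands(observed, n_jobs, seed=0):
    """Draw each resource demand independently from its observed values."""
    rng = random.Random(seed)
    resources = list(observed[0].keys())
    # Empirical marginal distribution: the list of values seen for each resource.
    marginals = {r: [job[r] for job in observed] for r in resources}
    return [{r: rng.choice(marginals[r]) for r in resources} for _ in range(n_jobs)]

# Hypothetical observed jobs, one dictionary of demands per job.
observed = [
    {"cpu_seconds": 2.0, "io_count": 150, "memory_kb": 64},
    {"cpu_seconds": 45.0, "io_count": 20, "memory_kb": 256},
    {"cpu_seconds": 0.5, "io_count": 900, "memory_kb": 32},
]
model_jobs = sample_job_demands(observed, n_jobs=5)
```

Sampling each resource separately ignores any correlation between demands, so a generated job may combine values that never occurred together in the real workload.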

The second approach is to extract real jobs from the real workload
by sampling the workload. The resource/service demands for these
sampled jobs are used to characterize the jobs in the workload model.
This method has been used by Shope, et al. [83] and Wood and Forman [91].

The  third  approach  mentioned  by  Ferrari  [32]  is  to  partition  the 
real  workload  into  classes,  each  characterized  by  similar  combinations 
of  resource/service  demand  patterns.  A  suitable  number  of  jobs  can 
then  be  selected  from  each  class,  and  the  resource  demands  for  these 
jobs used to characterize a job in the model. This approach has been
used by Joslin [51], Hunt, et al. [46], Agrawala, et al. [4] and
Mamrak and Amer [66].

The fourth general approach is to construct the joint probability
distribution of the parameters in the real workload (i.e. resource/
service demands) and derive from this distribution the parameters of a
set of jobs with the same distribution. Sreenivasan and Kleinman [86]
proposed this method and applied it to the construction of a test
workload for a batch-processing installation. The major drawback to this
method would appear to be that if a number of parameters are present,
the joint distribution becomes difficult to manage.

The last technique considers a job as a Markov process in which the
states of a job are specified in terms of the values or ranges of values
of its resource/service demands. The state-transition probability
matrices for the real workload are constructed and used to derive the
sequences of values for each job's parameters. This approach was
investigated by Lasseter, et al. [62] and a model using this approach was
implemented by Lindsay [63]. A recent work [70] investigated the
modelling of a job in which the states of the Markov model were the
types of programs being executed during each succeeding job-step.
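
The Markov technique can be illustrated with a short sketch, assuming each job is logged as a sequence of discrete states. The state names and the functions `estimate_transitions` and `generate` are hypothetical. This is not the model of [62], [63] or [70].

```python
import random
from collections import Counter, defaultdict

def estimate_transitions(sequences):
    """Estimate state-transition probabilities from observed state sequences."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for current, nxt in zip(seq, seq[1:]):
            counts[current][nxt] += 1
    return {
        state: {nxt: n / sum(c.values()) for nxt, n in c.items()}
        for state, c in counts.items()
    }

def generate(transitions, start, length, seed=0):
    """Generate a state sequence by walking the estimated chain."""
    rng = random.Random(seed)
    seq = [start]
    # Stop early if a state with no observed successor is reached.
    while len(seq) < length and seq[-1] in transitions:
        options = transitions[seq[-1]]
        seq.append(rng.choices(list(options), weights=list(options.values()))[0])
    return seq

# Hypothetical observed jobs, each a sequence of resource-demand states.
observed = [
    ["cpu_low", "io_high", "cpu_low", "io_high", "cpu_high"],
    ["cpu_low", "io_high", "cpu_high", "cpu_high", "io_high"],
]
P = estimate_transitions(observed)
synthetic = generate(P, start="cpu_low", length=8)
```

Here `P` plays the role of the state-transition probability matrix, stored as nested dictionaries, and `synthetic` is one derived sequence of states for a model job.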

Regardless  of  which  of  the  approaches  is  used,  the  result  should 
be  a  workload  model  stated  in  parametric  form.  That  is,  the  real 
workload  will  be  represented  as  a  series  of  jobs,  each  of  which  has 
a  certain  pattern  of  resource/service  demands. 

2.4 Types of Test Workloads

Test workloads can be classified as executable or non-executable
[32] depending upon whether they are intended for use in empirical
studies or analytical/simulation studies. Non-executable workloads are
of two general types. The first type is the probabilistic or
distributional workload. In this approach, the requests for resources or
services are represented as probability distributions. The real
distribution of these demands is often approximated by such standard
distributions as the geometric distribution, or the hyperexponential
distribution [88]. The closeness of this fit is obviously a factor in the
degree of representativeness achieved. This approach has been used by many
researchers [8]. Since the thrust of this research is not toward
analytical/simulation studies, no in-depth review of this representation
was made.

Alternately, the test workload may be a script of system demands
based upon the observed requests of a previously executed workload.
This approach is called a trace, since it traces out the set of demands
of a previously executed workload. A trace may be so detailed as to
indicate each individual machine instruction executed, or be a series of
aggregate demands placed on combinations of system resources [22,67,90].
As pointed out by Svobodova [88], the representativeness of a system
trace can be affected by the artifact introduced in the monitoring
process. Again, since this approach is of use in analytical/simulation
studies, no detailed review was undertaken.

The most obvious choice for an executable test workload is to use
the actual workload (or a subset of this workload) that the users
submit. In this case, the test workload is known as a benchmark.
Benchmarks reflect demands the users make on the system, and these user
demands must be translated into demands for system resources. Natural
workload models (benchmarks) have been investigated and used in a number
of studies [14,51,79,87]. Their use, however, has a number of
drawbacks. These include

(a) the drive workload is not flexible since it is constructed
from jobs with fixed characteristics,

(b) large amounts of data on auxiliary storage may need to be
duplicated to enable the running of some real jobs, and

(c) security or privacy considerations may prevent the use of
some jobs [86].

These and other considerations have led to the investigation of
alternate forms of executable workloads.

An instruction mix is an artificially constructed job which is
composed of a precise mix of certain types of instructions. This type
of test job was one of the first artificial models suggested for use
in performance studies [32] and it is useful in comparing the relative
throughput of processors [88]. The most common mix is the Gibson
mix [35], although numerous others have been suggested [33,45]. There
are some disadvantages to using instruction mixes which tend to
severely restrict their applicability. These disadvantages include
that their use is restricted to comparing systems with similar
instruction sets, and that they fail to account for input-output [42].

Another model which has been used to represent jobs in a test
workload is the standard job or kernel. These artificial jobs are
constructed to exhibit a particular behavior, and thus they cannot
be easily modified. They are of use when a projection of the workload
is needed. They have also been used to compare the relative performance
of language translators. Many collections of standard jobs exist [44].

A type of artificially constructed executable workload which has
received considerable attention in recent years is the synthetic job.
A synthetic job is a program which does not perform any "useful"
computing, but when executed results in demands for system resources
similar to the demands of the actual workload. Synthetic jobs are
generally written in a high level language with parameters which allow
for easy modification. These parameters normally allow the user to
specify the size of the program, amount of CPU time used, number and
types of files accessed, and the amount of I/O performed. Thus,
similar to benchmarks, synthetic jobs represent the workload from the
user's point of view. The use of synthetic jobs overcomes many of the
disadvantages of benchmarks. Resource-oriented synthetic jobs are
typified by the single adjustable job proposed by Buchholz [18].
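
To make the idea of a parameterized synthetic job concrete, a minimal sketch follows. It is an illustration only and is not Buchholz's program [18]. The function name `synthetic_job`, its parameters, and the use of a temporary scratch file are assumptions chosen for this example.

```python
import tempfile
import time

def synthetic_job(cpu_seconds, memory_bytes, io_records, record_size=512):
    """Consume roughly the requested resources without doing useful work."""
    buffer = bytearray(memory_bytes)          # hold the requested amount of memory
    record = bytes(record_size)
    written = 0
    with tempfile.TemporaryFile() as scratch:
        for _ in range(io_records):           # write phase
            written += scratch.write(record)
        scratch.seek(0)
        while scratch.read(record_size):      # read phase
            pass
    # Burn CPU until the process has accumulated the requested CPU time.
    start = time.process_time()
    x = 0
    while time.process_time() - start < cpu_seconds:
        x = (x * 31 + 7) % 1000003
    return {"memory_bytes": len(buffer), "bytes_written": written}

result = synthetic_job(cpu_seconds=0.05, memory_bytes=1 << 20, io_records=100)
```

Changing the three arguments changes the job's demand pattern without touching its code, which is the flexibility that fixed benchmark jobs lack. The demands actually recorded by a given system will only approximate the requested values.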

Wood and Forman [91], and Sreenivasan and Kleinman [86] have
successfully used the Buchholz job for constructing synthetic test workloads.
Curnow and Wichmann [25] developed an Algol job to simulate many
computational procedures. Oliver et al. [76] developed a series of five
simple synthetic jobs and experimented with them in producing
synthetic workloads. Functionally oriented synthetic jobs have been described
by Joslin [51] and Lucas [65]. For interactive or time-sharing
environments, the synthetic jobs are typically developed from scenarios that
specify system-independent functional activities and include a
designation of all actions, pauses and decisions made by the user. Work
in developing approximately representative test workloads for
interactive systems has been done by Karush [53], Nolan and Strauss [74],
Wright and Burnette [92] and Crothers [24].

2.5 Validation of Test Workloads

The results of the workload characterization process described in
section 2.3 should be a model of the workload. The test workload then
can be generated from this workload model through suitable
representation of the jobs making up the model. Since the aim of constructing
a workload model is to obtain a representative test workload, the
validity of the workload model should be assessed. Agrawala et al. [4]
describe validation of workload models obtained through the clustering
approach. They suggest that the workload model should be constructed
from one set of data (the design set) and validated using a second set
of data (the test set). The method of hypothesis testing [43] is
suggested for use in such a validation process.

Ferrari [32] discusses validation of the test workload. The
procedure suggested involves the execution of the test workload on the
system being tested. The pattern of resource demands made by the test
workload is then compared to the pattern made by the real workload.
This validation procedure was followed by Schwetman and Browne [81]
and Kernighan and Hamilton [55]. Ferrari [32] suggests that secondary
performance indices, in addition to those primary indices which were
used in constructing the test workload, be included for use in
validation.
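
The comparison of test and real workload can be sketched as a simple check of each performance index. The index names, the measured values, and the 10 percent tolerance below are hypothetical and serve only to illustrate the idea. Ferrari [32] does not prescribe this particular criterion.

```python
def relative_errors(real, test):
    """Relative error of each index of the test workload against the real one."""
    return {name: abs(test[name] - real[name]) / abs(real[name]) for name in real}

def validate(real, test, tolerance=0.10):
    """Return the indices whose relative error exceeds the tolerance."""
    errors = relative_errors(real, test)
    return {name: err for name, err in errors.items() if err > tolerance}

# Hypothetical indices; mean_turnaround_s stands in for a secondary index
# that was not used when the test workload was constructed.
real_indices = {"cpu_utilization": 0.72, "channel_utilization": 0.40,
                "mean_turnaround_s": 310.0}
test_indices = {"cpu_utilization": 0.70, "channel_utilization": 0.31,
                "mean_turnaround_s": 325.0}
failures = validate(real_indices, test_indices)
```

In this example only the channel utilization falls outside the tolerance, which would suggest that the I/O demands of the test workload need adjustment.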


2.6 Summary

Various approaches used to generate representative test workloads
have been surveyed in this chapter. It should be apparent from the
number of approaches surveyed that there is no widespread commitment
to any one single method. However, the approach which combines
characterization using clustering analysis and implementation using synthetic
jobs appears to be gaining favor as the most promising approach.

The characterization of workloads and its impact on computer
performance studies are still not well understood. Those approaches
which were surveyed have not passed the test of time. That is, in
most cases, they are single examples of possibly useful procedures.
Until they are used by other researchers, they remain simply suggestions
on how one might proceed.


CHAPTER  III 


SELECTING  THE  WORKLOAD 


3.1  Introduction 

The workload of most general purpose computing systems is dynamic
in the sense that it cannot be represented as a cyclic demand for
resources with a manageable repetition period. Furthermore, the needs of
a user community historically have tended to grow to match or exceed
the capacity of the computing system. Though the type of computing
done may not change dramatically, the number of users and their
frequency of use will steadily increase over the life of a system
[46].

The dynamic nature of the workload of a computer system basically
reflects the diversity of users. For example, in a large university
environment, jobs submitted to the computer system could include
instructional jobs, research jobs, administrative jobs (e.g. grade
reports), commercial jobs, and overhead jobs (e.g. billing). The
resource demand characteristics of these various classes of jobs may
be radically different. Instructional jobs are generally small jobs,
which individually use minimal resources, but due to the sheer number
of such jobs in the job mix, they become a significant part of the
workload. Research jobs, on the other hand, are much larger jobs,
hence individually account for a greater share of resource use than
instructional jobs. Administrative and overhead jobs may be difficult
to categorize since they are run for many different reasons. It should
generally be apparent that they make liberal use of input-output (I/O)
facilities, since most billing/report functions require heavy access to
stored data files.

Not only are the computational requirements of the various classes
of jobs different, the frequency and timing of runs may be significantly
different. The frequency and timing of instructional jobs are
influenced by factors such as the beginning/end of the semester, when the
particular programming assignment is due, and even the schedule of
extracurricular activities. Research jobs are influenced by such
things as project deadlines. Administrative/overhead jobs may be
considered cyclic, since they are generally run at about the same time
each month/semester. The pattern of submissions of all classes,
except possibly overhead jobs, can be affected by various operational
strategies such as reduced rates at particular times of the day.
The diverse nature of the workload (i.e. various types of jobs and
different arrival patterns) hinders any characterization effort.

3.2  Constructing  a  Workload  Model 

A problem which has received a great deal of attention
[3,4,5,12,41,46,66] is the establishment of a model of a computer workload.
This is, in a sense, an attempt to characterize the users of a computer
system. Such a model is important from the viewpoint of management
[46] since it aids in planning. That is, if the characteristics of
the user population are known, projections can be made, and orderly
expansion or replacement of the present system may be facilitated.

The approach taken to solve this problem has been statistical sampling.
The workload of the computer system is observed over some period of
time (e.g. a day, a month or a year). Random sampling of this
collection of jobs is then performed to achieve a representative collection
of jobs. This reduced collection is then analyzed to discern
underlying characteristics. These underlying characteristics are then
generalized to the population as a whole. There are a number of
difficulties associated with such an approach. Among these are:

(a)  A  significant  part  of  the  workload  may  be  in  the  form  of 
a  relatively  small  number  of  extremely  large  jobs.  These  may  be 
excluded  from  the  model  merely  by  chance. 

(b) The workload of a computer system is generally not static in
time. That is, a workload model constructed using data from a
particular period of time may not even resemble the workload present at
some other time, particularly with respect to the relative proportion
of various job classes represented in the model.
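
Difficulty (a) can be quantified with a short calculation. Assuming simple random sampling without replacement, the chance that a sample contains none of the large jobs follows from the hypergeometric distribution. The job counts used below are hypothetical.

```python
from math import comb

def prob_no_large_jobs(total_jobs, large_jobs, sample_size):
    """Probability that a simple random sample contains none of the large jobs."""
    return comb(total_jobs - large_jobs, sample_size) / comb(total_jobs, sample_size)

# Hypothetical case: 5 very large jobs among 1000, and a 10 percent sample.
p = prob_no_large_jobs(total_jobs=1000, large_jobs=5, sample_size=100)
```

For these figures the probability is a little under 0.6, so a 10 percent sample would more often than not miss every one of the five large jobs, however much of the total resource use they represent.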

Even  if  a  representative  workload  model  can  be  constructed, 
there  are  other  difficulties  which  minimize  the  usefulness  of  such  a 
model  to  construct  a  test  workload  for  use  in  a  performance  evaluation 
study.  Some  of  these  difficulties  are  detailed  in  the  next  two 
sections. 

3.3  Environmental  Impact  on  Resource  Demands 

The resource demand pattern for a given workstep (i.e. job,
transaction, or job step) is to some degree dependent upon its
environment. There are the obvious differences in the timing of the
resource demands. The recorded magnitudes of the demands may also
vary significantly from one run to the next of the same program
depending on the system loading at the time. This difference in
resource usage (and hence in the amount charged) from one run to the
next of the same program may be baffling and sometimes annoying to
the user; it must also be considered when constructing test workloads
for performance evaluation studies. That is, a workstep which is
removed from its environment and included in a sample for an
evaluation study may exhibit a decidedly different resource demand pattern
in its new environment.

An example of a particular resource demand which is subject to
environmental variations is the amount of central processor (CPU)
time required to complete a task. In a recent study, Davies [26]
reported significant variance in the recorded CPU time for the same
compute-bound program run under differing degrees of system loading.
It was found that the recorded CPU time tended to increase as the
loading on the system got heavier. Two sources are cited for
this variance in CPU times. The first, referred to as the "true
variation", is due to differences between runs in such things as cache
performance, paging behavior in virtual memory systems, and memory access
speed if processing is overlapped with "cycle-stealing". The second
cited source of variation is the non-repeatability of how the system
charges time to user processes, system overhead, and the idle state.
The charging algorithm varies from system to system. IBM [47]
recognizes the possibility of variations in CPU time between two runs of the
same program, and attributes it to such factors as channel program
retries, CPU architecture (core buffering), cycle stealing with
integrated channels, queue searching (such as task switching) and pending
interrupts. Although in many cases the variation of CPU time between
two runs of the same program may be small, Davies [26] cites one
instance in which two runs of the same program produced CPU
times in the ratio of 1:2.

A second resource demand which is subject to large environmental
variations is paging behavior. Paging activity is influenced by two
factors: program construction, and system environment. A program which
exhibits a high degree of locality of reference [23,85] will generally
not incur as much paging activity as one which does not have this
property. This generally will have no impact on selecting an
appropriate workload since the structure of programs is not normally modified.
It will, however, have an influence on the development of synthetic
jobs, which is considered later. The opportunity for environmental
variations in the paging behavior of a program becomes clear when a
particular paging strategy is considered. Consider, for example, the
Least-Recently-Used (LRU) paging algorithm [23]. This is a demand-paging
algorithm in that a page is only read into main memory when a reference
to it is made. As long as main memory is not full, no replacement of
pages is made. When physical memory is full, a strategy is employed
to decide which of a program's pages are to be "rolled-out" to free
space to read in the next referenced page. The LRU algorithm assumes
locality, and replaces that page which has not been referenced for the
longest period of time. If and when that page is again referenced, it
must be read back into main memory. Thus if a program is executing
in an environment in which main memory is not fully used, it is likely
to incur fewer page faults than if it is executing in a heavily loaded
environment in which some of its pages have to be "rolled out" and
then "rolled back in" upon the next reference to them. This, of course,
can result in widely varying channel utilization rates as well as
contributing to variations in CPU time and I/O time.
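
The effect of the environment on paging can be illustrated with a small simulation of LRU replacement. The sketch assumes that system loading is represented simply by the number of page frames left available to the program. The reference string and the function name `lru_page_faults` are hypothetical.

```python
from collections import OrderedDict

def lru_page_faults(references, frames):
    """Count page faults for a reference string under LRU with a fixed frame count."""
    resident = OrderedDict()   # pages in memory, least recently used first
    faults = 0
    for page in references:
        if page in resident:
            resident.move_to_end(page)        # mark as most recently used
        else:
            faults += 1                       # page must be read in
            if len(resident) >= frames:
                resident.popitem(last=False)  # evict the least recently used page
            resident[page] = True
    return faults

# The same program (reference string) run with 3, 4 and 5 available frames.
references = [1, 2, 3, 4, 1, 2, 5, 1, 2, 3, 4, 5]
faults_by_frames = {frames: lru_page_faults(references, frames) for frames in (3, 4, 5)}
```

The same reference string incurs 10 faults with three frames, 8 with four, and 5 with five, so the recorded paging demand of an unchanged program depends on how much memory the rest of the workload leaves free.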

3.4 Selection of an Appropriate Workload Subset

A performance evaluation study is normally conducted for a
specific purpose. Studies performed on a single system could involve such
things as assessing the impact that various dispatching strategies have
on the average turnaround time; assessing the effect that a different
page replacement strategy would have on paging behavior; or assessing
the impact that adding another increment of physical memory will have
on the behavior of a virtual memory machine. Obviously, one would like
the test workload to exhibit certain properties to enhance the study.
For example, if the evaluation study involves assessing the relative
behavior of two page replacement rules, and a test workload is employed
which does not fully utilize physical memory, the results of the study
are likely to be less than satisfactory. One must, then, match the
test workload to the evaluation study to some degree.

Workload  periods  which  are  apt  to  be  of  interest  in  evaluation 
studies  are  likely  to  be  extreme  periods.  That  is,  the  analyst  wishes 
to examine the system when some feature of it is heavily loaded. This
fact,  along  with  the  failure  to  account  for  environmental  variations, 
would  seem  to  severely  limit  the  applicability  of  system  workload 
models  constructed  by  statistical  sampling  to  performance  evaluation 
studies.  That  is,  it  is  highly  unlikely  that  one  could  achieve  a  job 
mix  which  would  "strain"  the  system  in  the  desired  manner  through 
random  sampling. 

An  alternative  to  constructing  the  system  workload  model  is  to 
select  a  period  of  system  activity  which  exhibits  the  desired  charac¬ 
teristics,  and  use  that  period  for  the  evaluation  study.  Not  only  is 
the  desired  characteristic  present,  but  the  environment  has  been  pre¬ 
served,  which  would  minimize  the  problem  of  environmental  variations. 
There is some sacrifice made if this procedure is followed, however.
Since there is no randomization in the assignment of worksteps to the
test workload, one cannot expect that the workload is representative of
the entire real workload. Hence, inferences of system behavior must be
restricted to, at most, similar periods. This may not be too great a
price to pay when compared with the alternatives.

Detection  of  abnormal  system  activity  which  may  be  of  interest  in 
performance  evaluation  studies  is  rather  a  trivial  task.  System 
accounting  logs  normally  contain  summary  data  on  system  activity  at  a 
level  appropriate  for  such  detection.  That  is,  information  on  the 
number  of  jobs  processed,  memory  utilization,  CPU  utilization,  etc.,  on 
a  per  hour  or  per  day  basis  is  recorded  for  management  information. 

This  data  can  be  summarized  and  displayed  in  the  form  of  a  gross  system 
profile.  Abnormal  periods  are  usually  apparent  from  such  profiles. 


Once the appropriate period is selected, it can be examined in more
detail to ensure that it does indeed possess the required characteristics.
Although they did not use it for this purpose, Bear and Reeves
[12] describe the system workload of the CDC CYBER 74 system at
Wright-Patterson AFB, Ohio at a level which would be appropriate for
selection of periods of interest. A similar profile of the workload of the
Amdahl  470/V6  at  Texas  A&M  University  is  illustrated  in  the  case  study 
of  Chapter  VI. 
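
As a minimal sketch of how abnormal periods might be picked out of a gross system profile, the following flags hours whose utilization lies well above the mean. The hourly figures, the function name, and the 1.5 standard deviation threshold are all assumptions made for illustration; an actual profile would be built from the installation's own accounting summaries.

```python
from statistics import mean, pstdev


def flag_extreme_periods(profile, threshold=1.5):
    """Return the periods lying more than `threshold` standard deviations above the mean."""
    values = list(profile.values())
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return []  # a flat profile has no extreme periods
    return [period for period, v in profile.items() if (v - mu) / sigma > threshold]


# Hypothetical hourly CPU utilization taken from accounting summaries.
hourly_cpu = {
    "08": 0.41, "09": 0.45, "10": 0.48, "11": 0.44,
    "12": 0.39, "13": 0.43, "14": 0.97, "15": 0.46,
}
print(flag_extreme_periods(hourly_cpu))
```

The flagged period would then be examined in more detail, as described above, before it is adopted for the study.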

3.5  Selecting  Descriptors  for  the  Worksteps 

Once  the  period  of  interest  has  been  selected,  a  set  of  descrip¬ 
tors  by  which  real  jobs  can  be  represented  must  be  selected.  If  system 
logs  are  used  to  obtain  data  on  the  real  workload,  this  involves 
deciding  which  of  the  recorded  items  are  essential  to  characterize  each 
job's  demand  on  the  system.  If  a  monitor  is  used  to  collect  the  data, 
this  determination  must  be  made  prior  to  the  installation  of  the  moni¬ 
tor,  to  allow  for  the  collection  of  appropriate  data. 

The number of descriptors used to characterize each job will, in
general, have a dramatic impact on the representativeness of the generated
test workload. That is, if too few descriptors are used, the
analyst cannot hope to faithfully reproduce the system behavior. If too
many descriptors are included, on the other hand, the analysis of the
workload data is complicated.

Ideally,  if  the  resource  demand  description  of  workload  is  ap¬ 
plied,  the  workstep  descriptors  should  completely  specify  the  demands 
placed  upon  the  various  system  resources.  Some  of  the  resources  upon 
which jobs place varying degrees of demand are

(a)  central  processing  unit  (CPU), 

(b)  I/O  processors  (channels), 

(c)  main  memory,  and 

(d)  peripheral  devices. 

The demands placed upon some of these resources are easier to characterize
than others. For example, the demand placed upon the CPU is reflected
in the elapsed time the CPU spends in the execution of the job. The demand
placed upon main memory can be measured by the size of the maximum
partition used by the job, the average partition size used, or, if greater
resolution is desired, the weighted sum of the various partition
sizes and the time each such partition is utilized.
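
The three memory measures can be sketched as follows. The record layout (partition size in kilobytes paired with the seconds it was held) is hypothetical, and the average is taken here as a time-weighted average, which is one possible reading of "average partition size."

```python
def memory_demand(partitions):
    """Summarize one job's main memory demand.

    `partitions` is a list of (size_kb, seconds_held) pairs, a hypothetical layout.
    """
    sizes = [size for size, _ in partitions]
    total_time = sum(seconds for _, seconds in partitions)
    weighted_sum = sum(size * seconds for size, seconds in partitions)  # KB-seconds
    return {
        "max_kb": max(sizes),                       # maximum partition used
        "time_avg_kb": weighted_sum / total_time,   # time-weighted average size
        "kb_seconds": weighted_sum,                 # weighted sum of size and time
    }


# Hypothetical job that held three partitions in turn.
job = [(128, 10.0), (256, 30.0), (64, 20.0)]
print(memory_demand(job))
```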

The  characterization  of  the  demands  placed  upon  I/O  channels  and 
peripheral  devices  is  somewhat  more  difficult.  There  are  normally  a 
myriad  of  peripheral  devices  attached  to  a  general  purpose  computer 
system.  It  is  highly  unlikely  that  an  evaluation  study  would  require 
resolution  to  the  extent  of  measuring  the  demands  placed  upon  each 
individual  device.  A  reasonable  measure  would  appear  to  be  the  amounts 
of  each  particular  type  of  I/O  activity  (i.e.  tape,  disk,  unit  record) 
done  by  the  job.  Most  system  accounting  logs  reflect  a  number  of  mea¬ 
sures  of  I/O  activity.  These  include  I/O  time,  as  well  as  the  number  of 
data  transfers  initiated  on  each  channel.  Though  the  number  of  data 
transfers  is  not  a  direct  measure  of  channel  activity  since  varying 
amounts  of  data  can  be  transferred,  it  may  be  sufficient  in  many 
evaluation  studies.  For  those  requiring  more  precision,  the  system 
accounting  data  can  be  augmented  with  hardware  monitor  data  reflecting 
the average channel activity per data transfer [91].

Previous  workload  characterization  efforts  reflect  a  multitude  of 
descriptor  sets  used  to  characterize  the  demands  placed  on  system 
resources  by  individual  jobs.  Sreenivasan  and  Kleinman  [86]  used  only 
two  variables,  CPU  seconds  and  the  total  number  of  data  transfers 
initiated  (EXCP  count).  A  third  variable,  amount  of  core  utilized,  was 
recognized  as  important  but  it  was  found  that  the  vast  majority  of  jobs 
required similar amounts of memory. For this reason, it was not included
in the descriptor set. Hunt [46] used eight descriptors: cards
read, lines printed, CPU time, Peripheral Processor Unit (PPU) time,
central memory, tape drives charged, cost to user, and whether or not
FORTRAN was used. Agrawala et al. [3,4,5] used eight features: CPU
time, executive request and control card charges, average number of
512-word core blocks used, number of job steps (programs) executed, wall
clock time, I/O to FASTRAND or disk devices, I/O to tape, and I/O
to high-speed drum devices. Mamrak and Amer [66] summarized the workload
using seven features: CPU time, disc EXCPs, tape EXCPs, cards
read, lines printed, DD cards, and core used in kilobytes, where an
EXCP reflects an I/O request and a DD card (data and device specification
card) reflects a file accessed.

As can be seen from the above examples, there is no widespread
agreement as to what constitutes a valid feature set for use in characterizing
the resource demands placed upon a computer system by a particular
job. The problem appears to be somewhat dependent upon the
particular system in use and involves considerable intuition on the
part of the analyst performing the study. A different set of descriptors,
along with some justification for its use, is described in the
case study in Chapter VI.

3.6  Collecting  Data  for  Construction  of  the  Test  Workload 

Once  the  particular  subset  of  the  real  workload  applicable  to  the 
evaluation  study  is  selected,  and  an  appropriate  feature  set  formulated, 
the  data  reflecting  the  feature  values  for  each  workstep  in  the  subset 
must  be  gathered.  It  may  be  that  data  collection  is  done  before  the 
determination  of  an  appropriate  feature  set  or  vice  versa.  These  two 
phases  are  certainly  complementary,  since  it  will  do  no  good  to  choose 
a  feature  which  cannot  be  measured  and  it  is  a  waste  of  resources  to 
collect  data  on  features  which  are  not  used  in  characterizing  the  work¬ 
load. 

If  monitors  are  used  to  collect  resource  demand  data,  there  is 
a  need  to  be  able  to  project  when  in  the  future  the  system  workload 
may  exhibit  similar  characteristics  to  the  period  selected  for  the 
study.  That  is,  the  period  of  interest  for  an  evaluation  study  is 
normally  selected  using  historical  data  in  the  form  of  a  system  profile. 
Unless  the  monitor  was  installed  and  data  collected  during  that  parti¬ 
cular  period,  which  is  unlikely,  a  period  in  the  future  likely  to 
exhibit  the  same  characteristics  must  be  projected,  so  the  monitor  can 
be "turned on" to collect the appropriate data. It then must be verified,
after the data is collected, whether the projected period in fact
exhibited the desired characteristics. This problem, as well as the
added cost of using a monitor, has caused most researchers attempting
to construct test workloads to use system accounting data.

The  case  for  using  system  accounting  data  in  characterizing  the 
resources used by a particular job is strong. First, the user is
charged according to the usage reflected in these logs. Thus, at least
from the point of view of management, the logs reflect the usage of
critical resources. Second, the data is collected already for other
purposes. The system analyst then obtains the data essentially without cost,
either "out-of-pocket" or in terms of additional overhead to the system.
Techniques for the collection of data as well as the types of data available
from the system logs at Texas A&M University are considered in
the case study of Chapter VI.

3.7 Summary

The  selection  of  an  appropriate  subset  of  the  real  workload  to 
use  in  a  system  performance  evaluation  study  is  one  of  the  first 
decisions  the  analyst  attempting  to  construct  a  test  workload  must 
consider.  The  subset  selected  must  exhibit  certain  characteristics  to 
enhance  the  evaluation  study  being  conducted.  Previous  approaches 
based  upon  statistical  sampling  are  not  likely  to  yield  the  desired 
workload,  since  they  fail  to  account  for  environmental  impacts  on  the 
resource  demands,  and  may  exclude  certain  key  parts  of  the  workload. 

An  alternative  is  to  construct  a  system  profile  using  system  accounting 
data,  examine  that  profile  to  detect  particular  desired  loading  charac¬ 
teristics,  and  use  all  or  a  portion  of  the  actual  workload  during  that 
period  in  the  study.  The  environment  is  thus  preserved,  and  the  analyst 
is  assured  that  the  particular  behavior  of  the  system  being  studied 
will be induced by the workload.

Once  the  subset  of  the  workload  is  selected,  the  demands  placed 
upon  the  various  system  resources  by  individual  jobs  must  be  quantified 
through  the  selection  of  a  set  of  descriptors.  This  choice  involves 
achieving  a  balance  between  the  resolution  of  the  precise  resources 
used  and  the  computational  complexity  in  the  analysis  phase.  Collection 
of  data  reflecting  the  real  workload  values  for  the  descriptor  set 
selected  is  the  last  task  associated  with  this  initial  phase.  System 
accounting  logs  provide  a  readily  available  source  of  data,  and 
normally  provide  adequate  information  on  resource  utilization. 
CHAPTER IV

ANALYZING  THE  WORKLOAD 


4.1  Introduction 

The  techniques  outlined  in  Chapter  III  will  produce  a  subset  of  the 
real  workload  which  can  be  used  to  construct  a  test  workload  for  use  in 
a  performance  evaluation  study.  This  subset  is  represented  as  a  number 
of  jobs,  each  described  by  some  set  of  descriptors.  The  time  of  arrival 
to  the  system,  possibly  the  originating  location  if  operating  in  a  dis¬ 
tributed  environment,  and  the  appropriate  values  of  the  descriptors 
form  a  complete  specification  of  each  job's  contribution  to  the  over¬ 
all  workload  of  the  system.  A  test  workload  can  be  generated  by  re¬ 
placing  each  of  the  jobs  on  a  one-to-one  basis  with  synthetic  jobs 
which  exhibit  the  same  or  similar  resource  demands.  This,  however,  can 
prove  to  be  an  extremely  trying  task  if  a  large  number  of  jobs  are 
included  in  the  workload  subset.  It  requires  designing  a  separate 
synthetic  job  to  replace  each  real  job  in  the  subset.  Previous  studies 
[3,4,5,30,46,66,86]  have  shown  that  the  workloads  of  computer  systems 
tend  to  be  composed  of  a  relatively  small  number  of  job  classes,  with 
resource  demands  similar  within  each  class.  If  such  classes  are 
present,  the  effort  required  in  constructing  the  test  workload  will 
be  considerably  diminished,  since  one  synthetic  job  can  generally  be 
used to represent all members within a class.

Thus,  there  is  the  need  for  analysis  of  the  real  workload  subset 
to  detect  and  isolate  those  jobs  which  exhibit  similar  resource  demand 
characteristics.  This  chapter  will  outline  a  statistical  clustering 
methodology  useful  in  such  an  analysis. 
$$V(X'_j) = \frac{V(X_j)}{(X_{j\,\max} - X_{j\,\min})^2},$$

where $\bar{X}_j$ is the mean and $V(X_j)$ is the variance of the
original unscaled variables. This approach has been
used  in  at  least  one  study  [66]  to  remove  the  dependence  upon  units 
from  the  workload  data. 

An alternate approach to scaling the variables was taken by
Agrawala et al. [3,4,5]. They defined $X_{j\alpha}$ to be the
$\alpha$-tile of the observed values of $X_j$, and then linearly scaled using

$$X'_j = \frac{10\,(X_j - X_{j\,\min})}{X_{j\alpha} - X_{j\,\min}}.$$

This results in a feature space in which $100\alpha\%$ of the observed data
points lie in the interval from 0 to 10. For example, if $\alpha$ is chosen
as .98, 98% of the transformed values will lie in the interval from
0 to 10 [3,4,5]. The stated purpose behind such scaling is to
produce an essentially uniform feature space which is not distorted
by the presence of outliers. The mean of the $j$th descriptor variable
under this scaling is

$$\bar{X}'_j = \frac{10\,(\bar{X}_j - X_{j\,\min})}{X_{j\alpha} - X_{j\,\min}},$$

while the variance is

$$V(X'_j) = \frac{100}{(X_{j\alpha} - X_{j\,\min})^2}\,V(X_j).$$

A third approach to scaling is to standardize the variables, that
is, to scale each variable $X_j$ to mean 0 and variance 1. This is
accomplished by the relation

$$X'_j = \frac{X_j - \bar{X}_j}{\sqrt{V(X_j)}}.$$

This transformation, although it has not been applied (at least as can
be determined) in workload studies, is probably the most common
transformation in statistical studies.

There  are  a  host  of  other  transformations  which  could  be  applied 
to  workload  data  to  remove  the  unit  dependence  and  provide  commensurable 
ranges  for  the  descriptor  variables.  There  does  not  appear  to  be  a 
clear  cut  choice  among  the  transformations  since  they  are  computa¬ 
tionally  similar  and  all  accomplish  the  basic  purpose.  Standardization 
provides  some  side  benefits.  That  is,  if  this  means  of  scaling  is 
used,  the  scaled  data  measures  the  variability  in  terms  of  standard 
deviation  units.  Furthermore,  since  the  original  data  is  expressed  in 
widely  different  units,  this  means  of  scaling  is  preferred  as  a 
prelude  to  a  principal  components  analysis  [2,72]. 
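
The three scalings discussed in this section can be compared with a minimal sketch. The CPU-second values are hypothetical, the function names are illustrative, and the empirical quantile rule used for the $\alpha$-tile (the smallest observation with at least a fraction $\alpha$ of the data at or below it) is an assumption; the cited studies may have computed it differently. The sketch also assumes the variable is not constant, so no divisor is zero.

```python
from math import ceil
from statistics import mean, pstdev


def range_scale(xs):
    """Scale to the interval [0, 1] using the observed minimum and maximum."""
    lo, hi = min(xs), max(xs)
    return [(x - lo) / (hi - lo) for x in xs]


def percentile_scale(xs, alpha=0.98):
    """Map the interval [minimum, alpha-tile] linearly onto [0, 10]."""
    ordered = sorted(xs)
    lo = ordered[0]
    q = ordered[max(0, ceil(alpha * len(ordered)) - 1)]  # assumed quantile rule
    return [10 * (x - lo) / (q - lo) for x in xs]


def standardize(xs):
    """Scale to mean 0 and variance 1 (population variance)."""
    mu, sigma = mean(xs), pstdev(xs)
    return [(x - mu) / sigma for x in xs]


# Hypothetical CPU seconds for ten jobs, including one outlier.
cpu_seconds = [2, 4, 4, 6, 8, 10, 12, 14, 20, 400]
print([round(v, 2) for v in range_scale(cpu_seconds)])
print([round(v, 2) for v in percentile_scale(cpu_seconds, alpha=0.9)])
print([round(v, 2) for v in standardize(cpu_seconds)])
```

With the outlier present, range scaling compresses the nine ordinary jobs into a small part of the unit interval, while the percentile scaling spreads them over the interval from 0 to 10 and leaves only the outlier outside it.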

4.3 Accounting for Correlation Among Variables

As developed in the previous section, each job selected for use in
a performance evaluation study can be represented by a vector
$\mathbf{X} = (X_1, X_2, \ldots, X_n)$, where the value of $X_j$ represents
the magnitude of the demand for the $j$th resource. If there are m jobs in
the selected subset, the resource demand characteristics for the subset
can be represented by an $m \times n$ matrix

$$\mathbf{X} = \begin{pmatrix}
X_{11} & X_{12} & \cdots & X_{1n} \\
X_{21} & X_{22} & \cdots & X_{2n} \\
\vdots & \vdots &        & \vdots \\
X_{m1} & X_{m2} & \cdots & X_{mn}
\end{pmatrix}$$

where the element $X_{ij}$ represents the magnitude of the demand of the
$i$th job for the $j$th resource.


The  variables  (descriptors)  selected  to  measure  the  magnitude  of 
the  demands  for  resources  for  jobs  in  the  selected  subset  will  likely 
be  correlated  to  some  degree.  That  is,  there  is  a  degree  of  linear 
association  among  the  variables.  For  example,  it  may  be  noted  that 
jobs  which  print  many  lines  of  output  have  relatively  large  values  of 
I/O  time,  or  that  jobs  which  incur  a  high  degree  of  paging  issue  an 
inordinately  large  number  of  disc  I/O  requests. 

The effect of intercorrelation among descriptor variables on the
resource demand pattern of the workload subset can easily be visualized
in two dimensions. Let $X_1$ and $X_2$ be two descriptor variables
which are correlated with a correlation coefficient $r > 0$. If a scatter
plot of the standardized values of $X_1$ and $X_2$ is constructed, an
elliptical pattern oriented along the line $X_2 = rX_1$ will result,
similar to that depicted in figure 4.1.

Fig.  4.1  The  Effect  of  Correlated  Variables 


Intercorrelation  among  the  descriptor  variables  will  bias  clust¬ 
ering  results  obtained  when  jobs  are  clustered  by  similar  resource 
demands  [16].  The  effect  is  to  provide  a  weighting  for  the  common 
characteristics reflected in the different variables. The severity of
this  bias  is  difficult  to  assess  in  general,  since  it  is  somewhat  prob¬ 
lem  dependent.  That  is,  it  is  related  to  the  degree  of  intercorrelation, 
the  distance  metric  used,  and  the  weighting  scheme  supplied  by  the  ana¬ 
lyst. 

It  should  be  noted  that  high  degrees  of  correlation  do  not,  in 
general,  indicate  causal  relationships,  since  there  are  many  instances 
of  totally  unrelated  phenomena  which  exhibit  high  correlation.  However, 
if  two  highly  correlated  variables  are  included  in  the  descriptor  set, 
the  biasing  effect  will  be  the  same  whether  a  causal  relationship  exists 
or  not.  This  bias  may  not  be  undesirable,  but  it  should  be  considered 
since  it  may  help  to  explain  seemingly  contradictory  results  obtained 
in  the  clustering  phase. 

The  problem  of  intercorrelation  among  descriptor  variables  is 
avoided  if  only  uncorrelated  variables  are  included  in  the  descriptor 
set.  This,  however,  is  not  feasible  in  most  cases. 

Given a set of n variables which are intercorrelated, it is possible
to construct a set of n or fewer composite variables which are
linear combinations of the original variables, are uncorrelated, and
account for the variance in the data [7]. This can be accomplished
by a method known as principal components [2,7,36,54,72,73,89].

Geometrically, the method of principal components involves a rotation
of axes. Each of the resource demand variables $X_1, X_2, \ldots, X_n$
is represented by a coordinate axis from the origin $O = (0, 0, \ldots, 0)$.
These n axes form an n-dimensional space, with the $i$th job represented
by a point whose coordinates are
$X_1 = X_{i1}, X_2 = X_{i2}, \ldots, X_n = X_{in}$.

In principal component analysis, the aim is to find a rotation of the
axes so that the variable $Y_1$ represented by the first of the new axes
has maximum variance. The variable $Y_2$ represented by the second of the
new axes is uncorrelated with $Y_1$ and has maximum variance under this
restriction. Similarly, the variable $Y_k$ represented by the $k$th new
axis is uncorrelated with $Y_1, Y_2, \ldots, Y_{k-1}$, and has maximum variance
under these restrictions [2]. The two variable case is illustrated
in the following figure, where the "dots" represent the various jobs
in the standardized resource demand descriptor space.

Fig.  4.2  Principal  Components  for  n  =  2. 
Computationally, principal component analysis involves finding the
eigenvalues of the correlation matrix of X, choosing eigenvectors
corresponding to the nonzero eigenvalues that are orthonormal to each
other, and postmultiplying the scaled data matrix by the matrix of
eigenvectors. The details of this procedure are given in Appendix A.


The matrix Y which is produced by principal component analysis
represents the scaled resource demand vectors of the workload subset
with relation to the orthogonal principal axes. The orthogonality
ensures that the new variables $Y_1, Y_2, \ldots, Y_n$ are uncorrelated;
hence clustering can proceed free of the biasing effect caused by the
intercorrelations among the original variables. An additional advantage,
a possible reduction of the dimension of the feature space, is gained by
using this procedure, as will be discussed in the next section.
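
For the two variable case the rotation can be written in closed form, which allows a minimal sketch without an eigenvalue routine. For standardized variables the correlation matrix is [[1, r], [r, 1]], whose eigenvalues are 1 + r and 1 - r with orthonormal eigenvectors (1, 1)/sqrt(2) and (1, -1)/sqrt(2). The data values and function names below are hypothetical; the general n variable procedure is the one referred to in Appendix A.

```python
from math import sqrt
from statistics import mean, pstdev, pvariance


def standardize(xs):
    """Scale to mean 0 and variance 1 (population variance)."""
    mu, sigma = mean(xs), pstdev(xs)
    return [(x - mu) / sigma for x in xs]


def principal_components_2d(x1, x2):
    """Return (r, eigenvalues, component scores) for two descriptor variables."""
    z1, z2 = standardize(x1), standardize(x2)
    r = mean([a * b for a, b in zip(z1, z2)])  # correlation of standardized data
    y_sum = [(a + b) / sqrt(2) for a, b in zip(z1, z2)]   # eigenvalue 1 + r
    y_diff = [(a - b) / sqrt(2) for a, b in zip(z1, z2)]  # eigenvalue 1 - r
    if r >= 0:
        return r, (1 + r, 1 - r), (y_sum, y_diff)
    return r, (1 - r, 1 + r), (y_diff, y_sum)  # keep the larger variance first


# Hypothetical descriptor values for five jobs.
lines_printed = [100, 200, 300, 400, 500]
io_seconds = [1.2, 1.9, 3.2, 3.8, 5.1]
r, eigenvalues, (y1, y2) = principal_components_2d(lines_printed, io_seconds)
print("r =", round(r, 4))
print("variances of the components:", round(pvariance(y1), 4), round(pvariance(y2), 4))
print("covariance of the components:", round(mean([a * b for a, b in zip(y1, y2)]), 6))
```

The variance of each component equals the corresponding eigenvalue, and the covariance between the two components vanishes, which is the property relied upon in the clustering phase.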

4.4 Reducing the Dimension of the Feature Space

If n resource descriptor variables $X_1, X_2, \ldots, X_n$ are used to
describe the demand placed on system resources, each job will be represented
by a point in n-dimensional space. Prior to clustering jobs
based  upon  similarity  of  resource  demands,  it  may  be  advantageous  to 
investigate  the  possibility  of  representing  each  job  in  a  space  of 
fewer  dimensions.  That  is,  it  may  be  possible  to  depict  the  salient 
features  of  the  resource  demand  patterns  with  k<n  descriptor  variables. 
This  is  desirable  from  a  computational  standpoint,  since  the  computa¬ 
tional  complexity  of  clustering  is  related  to  the  number  of  descriptor 
variables  as  well  as  the  number  of  data  units  (jobs  in  the  workload 
subset) . 

The problem of reducing the dimension of the feature space has
been examined in at least two workload characterization studies [3,66],
with somewhat contradictory results. Both studies approached the
problem in much the same way. The scaled resource demand matrix was
first input to a clustering algorithm with all variables present to
achieve a "true" partition of the workload. A single resource descriptor
was then removed, and the data matrix reclustered. This was
repeated, until all distinct sets of n-1 descriptors had been examined.
The  process  then  was  applied  to  descriptor  sets  of  size  n-2,  then  n-3, 
and  so  on.  The  clustering  performance  for  each  set  of  descriptors  was 
measured  by  examining  the  number  of  intercluster  "migrations"  as 
compared  to  the  "true"  partition.  One  study  [66]  reported  promising 
experimental  results  using  this  procedure,  while  the  other  [3]  down¬ 
played  its  usefulness.  This  seeming  contradiction  of  results  is 
probably  due  to  the  differences  in  the  two  selected  descriptor  sets, 
and  the  different  degrees  of  intercorrelations  among  those  features 
reflected  in  the  workload  data.  That  is,  if  a  descriptor  variable 
which  is  highly  correlated  with  another  variable  is  removed  from  the 
descriptor  set,  its  exclusion  will  likely  cause  fewer  perturbations  in 
the "true" partition than if a variable which is essentially uncorrelated
with other variables is excluded. This again follows from the
fact  that  correlated  variables  are,  to  some  degree,  reflecting  the 
same  characteristic  of  the  workload. 

Even  if  the  above  feature  reduction  algorithm  proves  useful  in 
reducing  the  dimension  of  the  feature  space,  it  suffers  from  a  fatal 
flaw.  As  previously  stated,  the  aim  of  reducing  the  dimension  of  the 
feature  space  is  to  reduce  the  number  of  computations  in  the  clustering 
stage  of  analysis.  Since  there  is  no  a  priori  indication  as  to  the 
relative  worth  of  each  descriptor  in  describing  the  "true"  partition, 
one  must  cluster  using  all  of  the  descriptors,  and  then  iteratively 
reduce  the  dimension  of  the  space.  Thus,  any  computational  advantage  is 
lost. This problem is overcome to a certain degree if clustering is
applied to the principal component scores rather than the scaled variate
scores.
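
The cost of the exhaustive procedure can be made concrete with a short calculation. It assumes the removal process is carried all the way down to single descriptors, which the studies may not have done; under that assumption one clustering run is needed for the full set and one for every smaller nonempty subset.

```python
from math import comb


def clusterings_required(n):
    """Clustering runs needed to examine every nonempty subset of n descriptors."""
    return sum(comb(n, size) for size in range(1, n + 1))


for n in (7, 8):
    print(n, "descriptors:", clusterings_required(n), "clustering runs")
```

The count is 2^n - 1, so even the seven and eight descriptor sets cited earlier would call for well over a hundred clustering runs.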

Aside  from  the  fact  that  its  application  produces  uncorrelated 
variables,  principal  component  analysis  also  is  useful  due  to  its 
maximum  variance  properties.  The  first  principal  component  has  the 
largest  variance  of  any  linear  combination  of  the  variables  represented 
in  the  resource  demand  matrix;  the  second  principal  component  has  the 
largest  variance  of  any  linear  combination  orthogonal  to  the  first 
principal  component;  the  third  principal  component  has  the  largest 
variance  of  any  linear  combination  orthogonal  to  the  first  two,  etc. 
This  leads  to  a  valuable  property  of  principal  components,  namely  that 
the  best  least  squares  fit  of  the  original  space  of  n  dimensions  in  a 
space  of  k<n  dimensions  is  achieved  by  using  the  first  k  principal 
components [7]. Thus, although all of the principal components must be
retained to achieve a perfect fit, an analyst who is satisfied with
representing only a portion (say 95%) of the variability may achieve
a significant reduction in the dimensionality of the problem.

Information  on  the  proportion  of  the  total  variability  of  the  data 
matrix  explained  by  the  first  k<n  principal  components  is  available 
without  recourse  to  clustering.  That  is,  it  is  a  normal  byproduct  of 
principal component analysis. This measure is

$$P = \frac{\lambda_1 + \lambda_2 + \cdots + \lambda_k}{\lambda_1 + \lambda_2 + \cdots + \lambda_n},$$

where $\lambda_1, \lambda_2, \ldots, \lambda_n$ are the eigenvalues of the
correlation matrix arranged in decreasing order.
Thus, by including enough components so that this ratio is at least as
great as the minimum acceptable value, one can effectively reduce
the dimension of the descriptor space and hence reduce the computational
requirements in the clustering phase. It should be noted that this
reduction of dimension is merely a reduction in presentation [72].

That is, measures on each of the original variables must still be
taken, since each may appear in the expression for a component variable.
The aim of reducing the computational requirements in later phases is
accomplished, however.
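
The selection of k from the eigenvalues can be sketched as follows. The eigenvalues shown are hypothetical (they sum to 5, as they would for a correlation matrix of five descriptors), and the function name and default threshold are illustrative assumptions.

```python
def components_to_retain(eigenvalues, min_proportion=0.95):
    """Return (k, proportion): the fewest leading components explaining min_proportion."""
    ordered = sorted(eigenvalues, reverse=True)  # decreasing order
    total = sum(ordered)
    running = 0.0
    for k, lam in enumerate(ordered, start=1):
        running += lam
        if running / total >= min_proportion:
            return k, running / total
    return len(ordered), 1.0


# Hypothetical eigenvalues of a 5 x 5 correlation matrix.
eigenvalues = [2.9, 1.4, 0.5, 0.15, 0.05]
for threshold in (0.85, 0.95):
    k, explained = components_to_retain(eigenvalues, threshold)
    print(threshold, "->", k, "components,", round(explained, 3), "explained")
```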

4.5 Clustering Algorithms

Each job in the workload subset can be represented as an n-dimensional
resource demand vector $\mathbf{X} = (X_1, X_2, \ldots, X_n)$, where
$X_i$ is the magnitude of the demand for the $i$th resource. Following
scaling and principal component analysis, each job is represented as a
k-dimensional vector $\mathbf{Y} = (Y_1, Y_2, \ldots, Y_k)$ in the
principal components space.
The next, and final, step in the analysis process is to cluster the
jobs by similar resource demands, thus achieving a partition of the
workload subset.

Prior  to  application  of  a  clustering  algorithm,  the  analyst  must 
decide  upon  a  measure  of  distance.  That  is,  a  measure  must  be  selected 
which  gives  an  indication  of  how  "close"  two  jobs  are  with  respect  to 
their  resource  demands.  A  number  of  such  distance  measures  are  present 
in  the  literature.  Probably  the  most  commonly  applied  is  the  Euclidean 
measure given by

$$D(\mathbf{Y}_j, \mathbf{Y}_\ell) = \left[\sum_{i=1}^{k} (Y_{j,i} - Y_{\ell,i})^2\right]^{1/2},$$

where $\mathbf{Y}_j$ is the standardized resource demand vector for the
$j$th job in the principal components space, and $\mathbf{Y}_\ell$ is the
similar vector for the $\ell$th job.

Another  consideration  is  the  appropriate  weight  the  analyst  wishes 
to  apply  to  each  of  the  descriptors.  That  is,  the  analyst  may  wish  to 
influence  the  clustering  algorithm  so  that  similarities  in  one  dimension 
carry greater weight than similarities in another dimension. The
weight $W_i$ for the $i$th descriptor is normally incorporated into the
distance calculation as

$$D(\mathbf{Y}_j, \mathbf{Y}_\ell) = \left[\sum_{i=1}^{k} W_i\,(Y_{j,i} - Y_{\ell,i})^2\right]^{1/2}.$$

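
A minimal sketch of the distance calculation follows. It assumes the weights multiply the squared differences inside the sum, which is one common convention; with all weights equal to 1 it reduces to the ordinary Euclidean measure. The function name and the sample vectors are illustrative.

```python
from math import sqrt


def weighted_distance(y_j, y_l, weights=None):
    """Weighted Euclidean distance between two jobs in the component space."""
    if weights is None:
        weights = [1.0] * len(y_j)  # unweighted case: ordinary Euclidean distance
    return sqrt(sum(w * (a - b) ** 2 for w, a, b in zip(weights, y_j, y_l)))


job_a = [0.0, 0.0]
job_b = [3.0, 4.0]
print(weighted_distance(job_a, job_b))              # equal weights
print(weighted_distance(job_a, job_b, [4.0, 0.0]))  # only the first component counts
```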
Once the analyst has decided upon a distance measure and a weighting
scheme, there are two general clustering schemes which may be used:
hierarchical and non-hierarchical clustering [36].

The  hierarchical  scheme  initially  views  the  collection  of  m  jobs 
as  m  separate  clusters  of  one  member  each.  A  similarity  measure  is 
calculated  between  each  pair  of  jobs,  and  those  two  jobs  which  are 
most  similar  are  joined  to  form  a  cluster  of  two  jobs.  This  cluster 
is  generally  represented  by  the  average  (centroid)  vector  of  the  two. 
This  process  is  continued,  with  the  two  "closest"  clusters  joined  at 
each  step  until  the  space  is  viewed  as  a  single  cluster  with  m  elements. 
The analyst can halt the process at any time, thus achieving a
partition with as many clusters as desired. This type of clustering scheme
is  typified  by  the  algorithm  proposed  by  Johnson  [50]. 
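
The merging process just described can be illustrated with a small Python sketch. It assumes Euclidean distance between cluster centroids as the similarity measure and is not a reproduction of Johnson's algorithm [50]; the function names are hypothetical.

```python
import math


def euclidean(a, b):
    """Unweighted Euclidean distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def centroid(points):
    """Average vector of a list of equally sized vectors."""
    return [sum(column) / len(points) for column in zip(*points)]


def hierarchical_clusters(jobs, n_clusters):
    """Start with one cluster per job and repeatedly join the two
    clusters whose centroids are closest, halting when n_clusters
    clusters remain."""
    clusters = [[job] for job in jobs]
    while len(clusters) > n_clusters:
        best = None
        # Every pair of clusters is examined at every step; this is
        # the repeated pairwise work that makes the scheme costly.
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                d = euclidean(centroid(clusters[a]), centroid(clusters[b]))
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]  # b > a, so index a is unaffected
    return clusters
```

Calling the function with successively smaller values of `n_clusters` reproduces the sequence of partitions the analyst would inspect when deciding where to halt.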

Non-hierarchical clustering requires achieving an initial partition
of the data set. There are a number of ways of achieving this initial
partition [7]. These include taking the first k jobs as cluster
centroids,  selecting  some  k  jobs  at  random  from  the  set  as  centroids, 
and  taking  a  partition  achieved  by  hierarchical  clustering  as  the 
initial  partition  [7].  Once  the  initial  partition  is  achieved,  it 
is refined by comparing all jobs with the cluster centroids and
assigning each job to the "closest" cluster. The major differences
in the various non-hierarchical schemes involve how and when the cluster
centroids are updated and how many passes are made through the data.
Non-hierarchical clustering schemes are typified by the k-means approach
of MacQueen as described by Anderberg [7].
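
A nearest-centroid scheme of this general kind might be sketched as follows. The sketch assumes the first k jobs are taken as the initial centroids and that centroids are recomputed once per pass; MacQueen's variant, by contrast, updates a centroid as each job is assigned. All names are hypothetical.

```python
import math


def euclidean(a, b):
    """Unweighted Euclidean distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest(job, centroids):
    """Index of the centroid closest to the job."""
    return min(range(len(centroids)),
               key=lambda c: euclidean(job, centroids[c]))


def k_means(jobs, k, max_passes=20):
    """Refine an initial partition by repeated nearest-centroid passes."""
    centroids = [list(job) for job in jobs[:k]]  # first k jobs as seeds
    assignment = None
    for _ in range(max_passes):
        new_assignment = [nearest(job, centroids) for job in jobs]
        if new_assignment == assignment:
            break  # no job changed cluster, so the partition is stable
        assignment = new_assignment
        for c in range(k):
            members = [j for j, a in zip(jobs, assignment) if a == c]
            if members:  # an emptied cluster keeps its old centroid
                centroids[c] = [sum(col) / len(members)
                                for col in zip(*members)]
    return assignment, centroids
```

Each pass compares every job with only k centroids, so no matrix of pairwise similarities is ever formed.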

The  decision  as  to  which  clustering  algorithm  to  use  is  largely 
problem  dependent.  Hierarchical  schemes  generally  provide  more  insight 
into  the  problem,  since  a  wide  range  of  partition  sizes  (number  of 
clusters)  can  be  examined  with  a  single  application  of  the  algorithm. 
This, however, is counterbalanced by the fact that the non-hierarchical
algorithms are more economical to use computationally, since they do
not require the repeated calculation of similarity measures between
each pair of data units [7]. Since the size of the workload subset
is generally quite large (e.g., 750 jobs with 7 descriptor variables
in one study [66]; 1342 jobs with 11 descriptor variables in another
[3]), the insight gained through the use of hierarchical clustering is
likely not worth the additional computational overhead incurred.

Repeated  application  of  a  non-hierarchical  clustering  scheme  such  as 
one  of  the  "nearest-centroid"  algorithms  detailed  in  Anderberg  [7]  will 
provide  the  needed  insight  at  less  cost  in  terms  of  computer  time. 
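
The cost argument can be made concrete with a simple count of distance calculations. The sketch below is only illustrative: the choice of 10 clusters and 5 passes for the nearest-centroid case is a hypothetical setting, not a figure taken from the cited studies.

```python
def pairwise_count(m):
    """Number of distinct job pairs among m jobs: m(m - 1)/2."""
    return m * (m - 1) // 2


def centroid_comparisons(m, k, passes):
    """Job-to-centroid comparisons for a nearest-centroid scheme."""
    return m * k * passes


for m in (750, 1342):
    print(m, pairwise_count(m), centroid_comparisons(m, k=10, passes=5))
```

For the two workload sizes quoted above, the initial similarity matrix alone requires 280,875 and 899,811 distance calculations respectively, before any merging has taken place.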

The bias caused by intercorrelated descriptor variables is
exposed through principal components analysis; however, it is not
eliminated. If an unweighted Euclidean distance measure is applied,
hyperspherical  clusters  will  be  formed.  Since  expressing  the  resource 
demand  vectors  with  respect  to  the  principal  components  effects  a 
simple  rotation  of  the  axes,  clustering  results  using  the  unweighted 
Euclidean  distance  measure  will  be  invariant  under  principal  components 
analysis.  That  is,  the  same  partition  of  the  workload  subset  will 
result  whether  clustering  with  respect  to  the  standardized  variable 
scores  or  with  respect  to  the  principal  component  scores.  This 
situation  is  illustrated  for  the  two  variable  case  in  figure  4.3. 
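
The invariance claim can be checked numerically for the two variable case. The following sketch uses hypothetical data points and a 45 degree rotation (the orientation of the component axes for two standardized variables); it applies when all components are retained, since discarding components would alter the distances.

```python
import math


def euclidean(a, b):
    """Unweighted Euclidean distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def rotate(point, theta):
    """Express a two-dimensional point in axes rotated by theta."""
    x, y = point
    return (x * math.cos(theta) + y * math.sin(theta),
            -x * math.sin(theta) + y * math.cos(theta))


def all_distances(points):
    """Distances between every pair of points, in a fixed order."""
    return [euclidean(a, b)
            for i, a in enumerate(points) for b in points[i + 1:]]


# Hypothetical standardized scores for four jobs.
points = [(1.2, 0.9), (-0.4, -0.7), (0.3, 0.5), (-1.1, -0.7)]
rotated = [rotate(p, math.pi / 4) for p in points]

max_difference = max(
    abs(before - after)
    for before, after in zip(all_distances(points), all_distances(rotated))
)
```

Because every interpoint distance is unchanged by the rotation (up to rounding), any algorithm driven solely by those distances produces the same partition in either coordinate system.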

Fig. 4.3 Application of an Unweighted Euclidean Distance Measure

The bias caused by the correlation between the variables $X_1$ and
$X_2$ is apparent in figure 4.3 from the "band" of data points in the
cluster. Thus, the intracluster variance will be greater in the
direction of correlation ($Y_1$) than in the direction orthogonal to it
($Y_2$). A weighting scheme is needed to equalize (or nearly so) the
intracluster variations in both directions.

Application of a weighting function causes the formation of
hyperelliptic clusters [7], with the axes of the ellipsoids oriented along
the  variable  axes.  If  a  weighting  scheme  could  be  devised  so  that 
the  intracluster  variations  in  all  directions  are  approximately  the 
same,  the  biasing  effect  would  essentially  be  neutralized. 

If  the  data  is  subjected  to  principal  components  analysis,  a 
measure  of  the  variation  along  each  of  the  component  axes  is  available. 
That is, $\mathrm{Var}(Y_j) = \lambda_j$. Intuitively, a weighting
function $W_j$ which is related to $\lambda_j$ would appear desirable.
Such a weighting scheme
would  weight  the  component  variables  in  proportion  to  the  variability 
that  they  "explain". 

Suppose that the weighting function $W_j = 1/\lambda_j$ is applied to the
component  scores.  This  weighting  function  has  precisely  the  same 
effect  as  standardizing  the  principal  components  and  then  clustering 
using  an  unweighted  distance  function.  The  effect  of  such  a  weighting 
scheme  is  illustrated  in  figure  4.4  for  the  two  variable  case,  where 
it  is  assumed  that 




Fig.  4.4  Effect  of  Improper  Weighting 


It  can  be  seen  from  figure  4.4  that  such  a  weighting  scheme 
merely reinforces the bias rather than neutralizing it. That is, the
intracluster variation in the direction of $Y_1$ is still greater than
that in the direction of $Y_2$, even more so than if an unweighted
distance measure were used. This type of weighting, then, is not likely
to improve the clustering results.
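
The equivalence between weighting by the reciprocal of the component variance and standardizing the component scores can be verified directly. In the sketch below the two eigenvalues and the component scores are hypothetical values chosen for illustration, and the weight is assumed to multiply the squared difference in each dimension.

```python
import math


def weighted_distance(a, b, weights):
    """Euclidean distance with a weight on each squared difference."""
    return math.sqrt(
        sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b))
    )


eigenvalues = [1.6, 0.4]            # hypothetical lambda_1, lambda_2
y_a, y_b = (1.5, 0.2), (-0.5, -0.1)  # hypothetical component scores

# Weighted distance on the raw component scores, W_j = 1/lambda_j.
d_weighted = weighted_distance(y_a, y_b, [1.0 / lam for lam in eigenvalues])

# Unweighted distance after dividing each score by sqrt(lambda_j).
z_a = [v / math.sqrt(lam) for v, lam in zip(y_a, eigenvalues)]
z_b = [v / math.sqrt(lam) for v, lam in zip(y_b, eigenvalues)]
d_standardized = weighted_distance(z_a, z_b, [1.0, 1.0])
```

The two distances agree to within rounding error, which is the sense in which the reciprocal weighting and the standardization of the components have the same effect.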

Suppose that a weighting function $W_j$ = [expression illegible in the
source] were applied, where $\lambda_j > 1$. This should result in the
formation of elliptic clusters whose
major  axes  are  orthogonal  to  those  illustrated  in  figure  4.4.  This 
weighting  scheme  is  illustrated  in  figure  4.5  for  the  two  variable  case. 

This weighting is seen to have the proper effect. That is, the
intracluster variations in both directions are the same, or nearly so.

Fig. 4.5 Effect of Proper Weighting


4.6  Summary 

A  statistical  methodology  has  been  proposed  to  aid  in  the  analysis 
and summarization of the workload subset selected for use in an
evaluation study. The major elements of this methodology are:

(a) Scaling of the data to commensurable ranges. A number of
schemes are available to accomplish this; however, the standardization
of all variables to mean 0, variance 1 offers some advantages.

(b)  Applying  principal  components  analysis  to  achieve  uncorrelated 
variables  and  allow  selection  of  some  k<n  of  the  resource  variables 
which  account  for  the  major  part  of  the  variance  in  the  data. 

(c) Applying a suitable clustering algorithm to associate
"similar" jobs in the principal components space. A non-hierarchical
scheme using a weighted distance metric appears the most promising.

An example of the application of this methodology to real workload
data  appears  in  the  case  study  in  Chapter  VI. 
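
The first two elements of the methodology can be illustrated for the two variable case, where the principal components have a closed form. The sketch below uses hypothetical CPU and I/O figures for five jobs; it relies on the fact that for two standardized variables with correlation r, the component directions are (1, 1)/sqrt(2) and (1, -1)/sqrt(2), with variances 1 + r and 1 - r.

```python
import math


def variance(values):
    """Population variance (divisor n)."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def standardize(values):
    """Scale a descriptor to mean 0 and variance 1."""
    mean = sum(values) / len(values)
    sd = math.sqrt(variance(values))
    return [(v - mean) / sd for v in values]


# Hypothetical resource demands for five jobs.
cpu_seconds = [12.0, 30.0, 45.0, 60.0, 90.0]
io_requests = [100.0, 260.0, 300.0, 520.0, 700.0]

x1 = standardize(cpu_seconds)
x2 = standardize(io_requests)

# Correlation of the standardized descriptors.
r = sum(a * b for a, b in zip(x1, x2)) / len(x1)

# Component scores: a 45 degree rotation of the standardized axes.
y1 = [(a + b) / math.sqrt(2) for a, b in zip(x1, x2)]
y2 = [(a - b) / math.sqrt(2) for a, b in zip(x1, x2)]
```

With positively correlated descriptors, as here, `y1` carries the larger share of the variance (1 + r) and would be the component retained if only one were kept.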


CHAPTER V

CONSTRUCTING  THE  TEST  WORKLOAD 


5.1  Introduction 

The  output  of  the  analysis  phase  will  be  a  summarized  form  of  the 
real workload subset. The jobs making up the subset are grouped
according to similar resource demands. Each "cluster" of similar jobs is
represented  by  the  cluster  centroid  and  a  cluster  membership  list.  Each 
of  these  clusters  can  be  further  analyzed  by  constructing  distribution 
functions  for  each  represented  descriptor  variable.  This  type  of 
analysis  would  yield  a  workload  model  which  could  be  used  in  analytic/ 
simulation  studies.  Appropriate  sampling  techniques  could  be  used  to 
extract  a  test  workload  from  such  a  model.  Empirical  studies,  on 
the  other  hand,  require  that  executable  test  workloads  be  constructed. 
Thus,  the  construction  of  distribution  functions  and  sampling  techniques 
will  not  yield  a  useful  test  workload  for  such  studies. 

A  number  of  different  types  of  executable  test  workloads  were 
surveyed in Chapter II. These included benchmarks, instruction mixes,
standard jobs, and synthetic jobs. Synthetic jobs offer advantages in
the areas of flexibility and portability over instruction mixes and
standard jobs. They also avoid the security and privacy problems
associated with using real jobs (benchmarks). A test workload composed
of synthetic jobs, then, is likely to be the most useful form of an
executable test workload.


One  of  the  primary  criteria  applied  in  assessing  the  usefulness 
of  a  test  workload  is  how  accurately  it  reflects  the  resource  demands  of 
the  real  workload  which  spawned  it.  A  test  workload  which  accurately 
reflects  the  characteristics  of  the  real  workload  is  said  to  be 
representative. Constructing a representative test workload using
synthetic jobs requires careful design of the jobs making up the mix. Some
of the techniques and procedures useful in designing synthetic jobs
will be surveyed in this chapter. Most of the techniques surveyed are
oriented toward test workloads constructed for a batch processing
installation. Similar considerations apply to transactions in a
time-sharing environment; however, the general form of the model is different.
The  actions  which  must  be  emulated  in  an  interactive  session  include 
user  log-on,  program  creation,  editing,  program  compilation,  program 
execution,  and  user  log-off.  A  model  embodying  such  actions  can 
more  realistically  be  referred  to  as  an  interactive  script  [32]  rather 
than  a  synthetic  job. 

5.2  General  Considerations  in  the  Design  of  Synthetic  Jobs 

A  synthetic  job  is  a  parametric  program  in  which  the  demands  placed 
upon  the  various  system  resources  are  controlled  by  the  values  assumed 
by  various  input  variables  (parameters)  [32].  This  relationship  to 
the  actual  resource  utilization  requires  the  programmer  to  approach 
the  design  of  synthetic  jobs  from  a  different  viewpoint  than  normal 
programming problems. Normal programming projects are usually
undertaken for a particular reason. That is, the user wants the computer
to perform a particular task. The task to be performed is the
overriding consideration in program development. There may be an attempt
to minimize the resources used in an effort to hold down the cost of the
project, but this is generally a secondary consideration. Synthetic
jobs,  on  the  other  hand,  are  independent  of  the  task  which  is  performed. 
They  are  also  independent  of  any  input  data  or  data  files  accessed  by 
the  real  programs  they  are  designed  to  emulate.  The  sole  consideration 
in  their  design  is  that  they  use  the  same  amount  and  types  of  resources 
that  their  real  counterparts  use.  Thus,  a  somewhat  arbitrary  "compute 
loop"  can  be  used  to  force  the  synthetic  job  to  consume  a  particular 
amount  of  CPU  time.  I/O  activity  by  real  jobs  can  be  emulated  by 
having  the  synthetic  job  access  arbitrary  files  of  the  required  type 
(i.e.  tape,  disk,  or  card).  These  files  can  be  "garbage  files" 
expressly  constructed  for  this  purpose,  or  any  other  file  to  which  the 
analyst  has  access.  Thus,  there  is  no  unique  synthetic  job  for  each 
situation.  A  multitude  of  logically  different  programs  can  be  forced 
to  exhibit  the  same  resource  demand  patterns  with  the  proper  choice 
and  setting  of  parameters. 

The  degree  of  complexity  of  a  synthetic  job  is  generally  determined 
by  the  level  of  detail  used  in  characterizing  the  real  workload.  If 
a limited resource descriptor set is used, a relatively simple
synthetic job will normally suffice. If, on the other hand, an expanded
resource  descriptor  set  is  used  which  reflects  more  minute  aspects  of 
the  real  job's  resource  utilization,  a  more  complex  synthetic  job  will 
generally be required. Ferrari [32] illustrated this point with two
examples.

The first example given by Ferrari [32] concerns construction of
a test workload for a batch processing installation. Jobs in the
workload were characterized by the descriptor pair $(t_{cpu}, n_{io})$. The
first  descriptor  gives  the  CPU  time  required  by  the  job  while  the 
second  gives  the  number  of  I/O  operations  initiated  by  the  job.  Since 
the  type  of  I/O  is  not  specified,  it  can  be  assumed  to  be  simple  "reads" 
from  cards  and  "writes"  to  a  printer  (or  any  other  mode  for  that  matter) 
in  an  arbitrary  proportion.  A  synthetic  job  designed  to  emulate  such 
jobs  can  be  composed  of  a  simple  loop.  I/O  is  performed  a  certain 
proportion  of  the  iterations  through  the  loop,  and  some  arbitrary 
computation  performed  some  other  (or  perhaps  the  same)  proportion  of 
the  times  through  the  loop.  The  loop  is  executed  until  the  required 
number  of  I/O  operations  are  performed  and  the  proper  amount  of  CPU 
time  is  accrued.  An  example  of  such  a  synthetic  job  and  a  situation 
in  which  this  low  level  of  detail  is  sufficient  is  given  in  the  case 
study  in  Chapter  VI. 
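
A loop of the kind described might look as follows in a modern setting. This is a hedged sketch rather than the synthetic job used in the case study: the function name, the 80-byte record, the scratch file, and the particular arithmetic are all assumptions, and buffered writes to a temporary file only approximate physical I/O operations.

```python
import tempfile
import time


def synthetic_job(cpu_seconds, n_io, record=b"x" * 80):
    """Loop until n_io writes have been issued and roughly
    cpu_seconds of CPU time have been accrued."""
    start = time.process_time()
    io_done = 0
    iterations = 0
    checksum = 0
    with tempfile.TemporaryFile() as scratch:  # a "garbage file"
        while io_done < n_io or time.process_time() - start < cpu_seconds:
            if io_done < n_io:
                scratch.write(record)  # one I/O request per iteration
                io_done += 1
            for i in range(1000):      # arbitrary computation
                checksum = (checksum + i * i) % 1000003
            iterations += 1
    return {
        "iterations": iterations,
        "io_operations": io_done,
        "cpu_seconds": time.process_time() - start,
    }
```

The two parameters correspond directly to the two descriptors; nothing about the computation or the file contents is meaningful, which is the point made above about synthetic jobs being independent of any task.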

More  complex  synthetic  jobs  are  typified  by  the  one  developed  and 
tested  by  Buchholz  [18].  This  job  is  designed  to  emulate  a  file 
processing  action.  There  are  three  parameters  used,  which  specify  the 
number  of  master  records  read  in,  the  number  of  detail  (transaction) 
records  processed,  and  the  number  of  times  the  "compute"  loop  is 
executed.  This  job  can  be  used  to  emulate  the  resource  demands  of 
jobs  whose  resource  descriptor  set  is  somewhat  expanded  over  the 
earlier one described. An example of the use of such a synthetic job
is also given in the case study of Chapter VI.

5.3 Parameterization of Synthetic Jobs

The  parameters  of  a  synthetic  job  allow  the  individual  system 
resource  demands  to  be  easily  modified.  In  general,  greater  flexibility 
requires  more  parameters,  while  simplicity  and  economy  dictate  that 
the  number  of  such  parameters  be  kept  to  a  minimum.  In  the  final 
analysis,  it  is  the  level  of  detail  used  in  characterizing  the  real 
workload which determines the number of parameters to use. This
required level of detail is in turn determined by the resolution necessary
in the evaluation study. For example, consider a test workload
composed of synthetic jobs where each synthetic job has parameters to
specify memory size and total CPU processing time. This workload
might be sufficient if the aim of the evaluation study is to determine
the effects of altering main memory on CPU utilization. It would
not provide the required resolution if the aim of the study is to
determine the effects of differing amounts of I/O processing on CPU and
I/O overlap. In fact this latter study would require at least one
parameter  to  allow  the  ratio  of  CPU  processing  to  I/O  processing  to  be 
altered.  It  may  also  be  necessary  to  include  resource  descriptor 
variables  which  specify  the  duration  and  relative  timing  of  I/O 
requests.  Thus,  there  is  a  three-way  dependence  among  the  performance 
measures  observed  in  the  study,  the  descriptor  variables  used  to 
characterize  jobs  in  the  workload,  and  the  synthetic  job  parameters 
used  to  control  the  demands  placed  on  various  system  resources. 

More formally, suppose that a test workload is constructed for
use in an evaluation study in which the $\ell$ performance variables
$V_1, V_2, \ldots, V_\ell$ are to be observed. Suppose further that these
performance variables are functions of m system resources described by
the descriptor variables $r_1, r_2, \ldots, r_m$, and that the values
assumed by these descriptor variables are determined by n user
parameters $p_1, p_2, \ldots, p_n$. The relations existing among the
variables can be expressed as

$$
\begin{aligned}
V_1 &= V_1(r_1, \ldots, r_m) = V_1[r_1(p_1, \ldots, p_n), \ldots, r_m(p_1, \ldots, p_n)] = V_1(p_1, \ldots, p_n) \\
V_2 &= V_2(r_1, \ldots, r_m) = V_2[r_1(p_1, \ldots, p_n), \ldots, r_m(p_1, \ldots, p_n)] = V_2(p_1, \ldots, p_n) \\
&\;\;\vdots \\
V_\ell &= V_\ell(r_1, \ldots, r_m) = V_\ell[r_1(p_1, \ldots, p_n), \ldots, r_m(p_1, \ldots, p_n)] = V_\ell(p_1, \ldots, p_n).
\end{aligned}
$$

The relations can be summarized in more compact vector notation as
$V = V(r) = V[r(p)] = V(p)$. Now, recognizing that the values
assumed by the parameters $p_1, \ldots, p_n$ completely determine the
test workload $W_t$, the composite relation $V_i = V_i(W_t)$,
$i = 1, \ldots, \ell$ (or $V = V(W_t)$) results, where
$W_t = W_t(p_1, \ldots, p_n)$.

One problem which must be solved in constructing the test workload
$W_t$ is determining the relationship which exists between the resource
descriptor variables $r_1, \ldots, r_m$ and the synthetic job parameters
$p_1, \ldots, p_n$. The parameters $p_1, \ldots, p_n$ can be assumed
independent of one another, and in some cases they may bear a simple
linear relationship to the $r_i$'s. This relationship can be established
by observing the $r_i$'s for a few runs of the synthetic job with varying
$p_j$'s, and applying regression analysis [28]. The linear form of the
relationships $r_i = r_i(p_1, \ldots, p_n)$, $i = 1, 2, \ldots, m$,
allows inversion to give relationships of the form
$p_j = p_j(r_1, \ldots, r_m)$, $j = 1, \ldots, n$. This assumes
$n = m$ and that the original system is non-singular. These latter
relations  can  be  used  to  determine  the  appropriate  parameter  settings 
to  produce  a  given  resource  demand  pattern. 
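
The fit-then-invert procedure can be sketched for the simplest case of one parameter and one descriptor. The calibration figures below are hypothetical (they are not taken from the case study), and the function names are assumptions; with several parameters the same idea applies, with the scalar division replaced by the solution of a linear system.

```python
def fit_line(p_values, r_values):
    """Least-squares fit of r = intercept + slope * p."""
    n = len(p_values)
    p_mean = sum(p_values) / n
    r_mean = sum(r_values) / n
    sxx = sum((p - p_mean) ** 2 for p in p_values)
    sxy = sum((p - p_mean) * (r - r_mean)
              for p, r in zip(p_values, r_values))
    slope = sxy / sxx
    return r_mean - slope * p_mean, slope


def parameter_for(target_r, intercept, slope):
    """Invert the fitted line: the parameter setting expected to
    produce the target resource demand."""
    return (target_r - intercept) / slope


# Hypothetical calibration runs: compute-loop count vs. CPU seconds.
loop_counts = [1000, 2000, 3000, 4000]
cpu_seconds = [0.6, 1.1, 1.6, 2.1]

intercept, slope = fit_line(loop_counts, cpu_seconds)
setting = parameter_for(1.35, intercept, slope)
```

Here the fitted line is approximately r = 0.1 + 0.0005 p, so a job requiring 1.35 CPU seconds would be emulated with a loop count of about 2500. The inversion is only trustworthy for targets inside the range of demands observed during calibration.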

Examples of the use of linear regression in establishing the
relationships which exist between the resource descriptor variables
$r_1, r_2, \ldots, r_m$ and the synthetic job parameters
$p_1, p_2, \ldots, p_n$ are given in the case study of Chapter VI. It
should be noted that the simple form of these relations does not
suggest that similar simple relationships exist between the
performance variables $V_1, V_2, \ldots, V_\ell$ and the resource
descriptor variables $r_1, r_2, \ldots, r_m$. Establishing this
relationship must be accomplished during the evaluation study itself.

5.4  Controlling  the  Demand  for  System  Resources 

A procedure for establishing the relationship between the resource
descriptor variables $r_1, r_2, \ldots, r_m$ and the synthetic job
parameters $p_1, p_2, \ldots, p_n$ was suggested in the previous
section. This procedure
assumes  that  parameters  which  are  likely  to  affect  the  job's  demand 
for  a  given  resource  have  been  established  and  incorporated  into 
the  design  of  the  synthetic  job.  Some  of  the  ways  in  which  the  demands 
placed  upon  system  resources  can  be  controlled  are  surveyed  in  this 
section. 

One  of  the  major  system  resources  is  main  memory.  The  amount  of 
main  memory  used  by  a  given  job  is  obviously  related  to  the  size  of  the 
program  as  well  as  the  space  needed  for  system  routines  supporting 
the job's execution. A job's main memory requirements can thus be
altered by modifying the size of arrays or by including routines
which may never be called. A number of systems (e.g., IBM) enforce
a  policy  known  as  "preallocation  of  resources"  to  preclude  deadlock 
problems  [23].  The  maximum  amount  of  main  memory  likely  to  be  used 
by  the  job  must  be  requested  in  advance  of  its  initiation.  If  this 
requested  amount  is  not  sufficient  to  allow  program  execution,  the 
job  is  terminated.  The  size  of  the  region  in  main  memory  allocated  to 
a  particular  program,  if  such  a  strategy  is  employed,  can  be  either 
increased  or  decreased  by  altering  the  region  request  field  in  the 
job  control  statements. 

Control  of  the  amount  of  CPU  processing  time  used  by  a  program  is 
possible  by  including  a  "compute-loop"  control  parameter.  An  arbitrary 
sequence  of  computations  is  performed  iteratively  until  the  desired 
CPU  time  is  accrued.  The  required  number  of  iterations  through  the 
loop  can  be  controlled  precisely  through  access  to  system  timers  [32]. 
It  can  alternately  be  established  in  advance  through  calibration 
experiments. The amount of processing time accrued by a particular job
is related to factors other than simply the number of computations
performed. The number of I/O activities initiated, for example, can have
a significant impact on CPU time used.
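
The alternative of fixing the iteration count in advance through calibration might be sketched as below. The loop body, the default of 200,000 calibration iterations, and the function names are assumptions made for illustration; as noted above, the figure obtained is environment dependent and does not account for CPU time consumed in servicing I/O.

```python
import time


def compute_loop(iterations):
    """Arbitrary computation repeated a controlled number of times."""
    total = 0
    for i in range(iterations):
        total = (total + i * i) % 1000003
    return total


def calibrate(iterations=200_000):
    """Estimate CPU seconds consumed per loop iteration."""
    while True:
        start = time.process_time()
        compute_loop(iterations)
        elapsed = time.process_time() - start
        if elapsed > 0:
            return elapsed / iterations
        iterations *= 2  # timer too coarse: lengthen the run


def iterations_for(cpu_seconds, seconds_per_iteration):
    """Loop count expected to accrue the desired CPU time."""
    return max(1, round(cpu_seconds / seconds_per_iteration))
```

A job would then be given `iterations_for(target, calibrate())` as its compute-loop parameter, trading the precision of a timer check inside the loop for a job that needs no access to system timers while it runs.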

Control of the I/O processing requirements of a job is more
difficult than control of either main memory or CPU time. There are a multitude
of  different  types  of  I/O.  It  may  be  necessary  to  control  each  of 
them,  depending  upon  the  resolution  needed  in  the  study.  Unit  record 
I/O  (i.e.  cards  read,  lines  printed,  and  cards  punched)  is  the  easiest 
to control. The number of cards read is obviously a direct function
of the size of the program. It can be varied, within certain limits,
by  including  or  excluding  comment  and  data  cards.  The  number  of  lines 
printed (or cards punched) can be controlled through inclusion of a
"print" (or "punch") loop. This loop is executed a sufficient
number of times to produce the desired output. Tape and disk (or drum)
I/O is controlled by creating files which are accessed using the
proper  mode.  Records  can  be  read,  modified,  and  written  under  the 
control  of  a  file  processing  loop.  There  is  a  potential  problem  in 
accurately  reflecting  the  real  workload's  processing  behavior.  This 
results  from  the  fact  that  in  addition  to  controlling  the  number  of 
I/O  activities  initiated,  the  size  of  the  data  block  transferred 
each  time  must  also  be  specified.  Data  on  the  real  workload's  resource 
demands  is  generally  not  available  at  the  required  level  of  detail 
from  system  accounting  logs.  It  can  be  obtained  by  using  a  monitor, 
as  was  mentioned  in  Chapter  III. 

Another  type  of  I/O  activity  which  must  be  controlled  in  virtual 
memory  systems  is  paging  I/O.  In  a  demand  paging  environment,  blocks 
of  data  are  transferred  from  auxiliary  storage  into  main  memory  as 
required.  If  main  memory  is  full,  some  "pages"  may  have  to  be  recopied 
back  to  auxiliary  storage  to  make  room  for  the  next  "page"  copied 
into  main  memory.  Paging  activity  can  be  controlled  to  a  certain 
extent  by  careful  program  development.  Techniques  useful  in  improving 
the  locality  of  a  program  and  thus  decreasing  its  expected  page  fault 
rate  are  discussed  by  Spirn  [85].  Paging  activity  is  also  highly 
environment dependent. Thus any significant control over paging
activity will likely have to be exerted during the calibration/validation
phase when the entire test workload is available.

Direct  control  can  be  exerted  over  many  of  the  system  resources 
through  inclusion  of  loop  control  parameters  and  proper  job  control 
statements.  An  example  of  the  use  of  parameters  to  control  the  various 
system  resources  is  included  in  the  case  study  of  Chapter  VI. 

5.5 The Design of Calibration Experiments

Once a synthetic job has been designed, it is necessary to
establish the relationship between the parameters of the synthetic
job and the resource descriptor variables used to characterize jobs
in the real workload. Such a process can be termed "calibrating" the
synthetic job. The procedure proposed in Section 5.3 requires that
the  synthetic  job  be  executed  on  the  system  for  various  parameter 
settings.  The  corresponding  values  of  the  descriptor  variables  are 
recorded  for  each  run,  and  regression  analysis  used  to  establish  the 
desired  relationship.  There  are  a  number  of  unanswered  questions 
associated  with  this  procedure.  These  include  how  many  runs  of  the 
synthetic  program  are  necessary  to  establish  an  accurate  relationship, 
what  parameter  settings  should  be  used  for  each  run,  and  how  to 
account  for  the  acknowledged  environmental  variations  (see  Chapter 
III)  in  the  resource  demands  from  one  run  to  the  next.  The  use  of 
statistical  experimental  design  techniques  is  proposed  in  this  section 
to  assist  in  answering  these  questions. 

The  magnitude  of  the  demands  placed  on  system  resources  by  a 
given  job  can  vary  from  one  run  to  the  next.  Some  of  the  demands  most 
susceptible  to  these  environmental  differences  are  CPU  processing 
time, I/O processing time, and data transfer over the channels handling
paging activity. This variation in resource demands can have a
significant effect on relationships established through regression analysis.
Indiscriminate running of the synthetic job will yield data in which
it  is  impossible  to  separate  the  effect  on  the  response  variable  due 
to  this  "chance"  variation  from  that  caused  by  the  setting  of  various 
parameter  levels. 

Most  of  the  parameters  used  in  controlling  the  magnitude  of  the 
demands  placed  upon  various  system  resources  by  a  synthetic  job  can 
assume  a  wide  range  of  values.  For  example,  the  number  of  times  a 
"compute"  loop  is  executed  is  constrained  only  to  be  a  non-negative 
integer. Similar restrictions (or lack thereof) apply to other
parameters. Failure to use a wide enough range of values for these
parameters will yield a predictor equation which cannot be used in
some cases. This is because it is rarely safe to extrapolate a
regression equation beyond the range of the data used to fit it [28].

Related  to  the  setting  of  the  parameter  levels  for  each  run  of  the 
synthetic  job  is  the  required  number  of  runs.  The  synthetic  job 
could  be  run  a  large  number  of  times  (say  100)  with  the  parameters  set 
at  the  same  values.  This  obviously  would  yield  a  highly  reliable 
relationship  for  that  particular  combination  of  settings.  The  validity 
of  the  relationship  for  some  other  combination  of  parameter  settings 
would  be  highly  suspect. 

Problems  similar  to  those  outlined  above  are  commonly  encountered 
in  other  data  analysis  situations.  A  branch  of  statistics  known  as 
experimental  design  [43]  has  evolved  to  aid  in  the  resolution  of  these 
problems. The methodology outlined for designing factorial experiments
[43] appears applicable to this problem.

A  factorial  experiment  is  one  in  which  all  levels  of  a  given  factor 
are  combined  with  all  levels  of  every  other  factor  of  the  experiment 
[43]. Each of the synthetic job parameters to be varied can be
considered as a factor in the calibration experiment. Levels for each factor
can  be  established  which  are  likely  to  cover  the  required  range  of 
resource  demands.  Each  unique  combination  of  factor  levels  can  be 
thought  of  as  a  "treatment"  to  be  applied.  Treatments  are  assigned 
at  random  to  each  run  of  the  job. 
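
The construction of a factorial calibration experiment can be sketched with the Python standard library. The factor names and levels below are hypothetical parameters of an imagined synthetic job, and the fixed seed is used only to make the example repeatable.

```python
import itertools
import random


def factorial_design(factors, seed=None):
    """Return every combination of factor levels (the treatments)
    in a randomized run order."""
    names = list(factors)
    treatments = [
        dict(zip(names, levels))
        for levels in itertools.product(*(factors[n] for n in names))
    ]
    random.Random(seed).shuffle(treatments)  # random treatment-to-run order
    return treatments


# Hypothetical synthetic job parameters, two levels each.
factors = {
    "compute_loops": [1000, 5000],
    "records_read": [100, 1000],
    "lines_printed": [50, 500],
}
run_order = factorial_design(factors, seed=1)
```

With three factors at two levels each, `run_order` holds the eight treatments of one complete replication; the i-th run of the synthetic job is made with the parameter settings in `run_order[i]`.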

The  use  of  statistical  design  techniques  provides  a  number  of 
advantages  in  calibration  experiments.  They  include: 

(a) The randomization of the treatment-to-run assignment minimizes
the effect of chance environmental variations in resource demands.

(b)  For  a  given  number  of  factors  and  levels  per  factor,  one 
can  precisely  calculate  the  number  of  runs  necessary  for  a  complete 
replication of the experiment. For example, if five factors are
present, and each can assume two levels, $2^5 = 32$ runs are required.
The analyst can reduce the number of runs by using fractional
replications. This involves confounding some effects.

(c)  The  significance  of  the  effects  on  the  resource  demands  by 
the  various  parameters  can  be  tested  through  an  analysis  of  variance. 
Interaction  effects  can  also  be  tested,  although  in  some  cases  it  is 
difficult  to  interpret  such  effects. 

(d)  Confidence  limits  can  be  established  for  the  obtained 
regression  coefficients. 
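
The run counts mentioned in item (b) can be illustrated for the five factor, two level case. The sketch below codes the two levels of each factor as -1 and +1 and builds a half replicate by setting the last factor equal to the product of the others; this particular choice of fraction is an assumption made for illustration, not necessarily the design used in the case study.

```python
import itertools


def full_factorial(n_factors):
    """All 2**n combinations of two coded levels, -1 and +1."""
    return list(itertools.product((-1, 1), repeat=n_factors))


def half_fraction(n_factors):
    """Half replicate with 2**(n-1) runs: the level of the last
    factor is the product of the levels of the others."""
    runs = []
    for base in itertools.product((-1, 1), repeat=n_factors - 1):
        last = 1
        for level in base:
            last *= level
        runs.append(base + (last,))
    return runs


print(len(full_factorial(5)), len(half_fraction(5)))
```

The half replicate needs 16 runs instead of 32. The price is the confounding noted above: in this fraction the product of all five coded levels is +1 in every run, so each main effect is indistinguishable from the four-factor interaction of the remaining factors.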

It costs no more in most cases to conduct a carefully designed
experiment  than  it  does  a  poorly  designed  one.  The  use  of  statistical 
experimental  design  techniques  can  have  a  significant  impact  in  the 
calibration  phase.  An  application  of  these  techniques  is  given  in 
the  case  study  of  Chapter  VI. 

5.6 Validating the Test Workload

The  calibration  experiments  discussed  in  the  previous  section 
can  be  used  to  establish  predictor  equations  relating  the  synthetic 
job parameters to the resource descriptor variables. A synthetic job
mix  can  then  be  constructed  by  including  sufficient  copies  of  each 
of  the  synthetic  jobs  with  the  appropriate  parameter  settings.  It 
is  necessary  to  execute  this  synthetic  mix  on  the  system  being  studied 
and  to  determine  what  degree  of  representativeness  has  been  achieved. 
This  process  can  be  termed  validation. 

A  number  of  authors  [4,32,49,86]  have  emphasized  the  importance 
of  validating  test  workloads.  The  general  consensus  seems  to  be  that 
a  test  workload  which  has  not  been  validated  should  not  be  used.  The 
particular  subset  of  the  real  workload  which  is  used  as  a  model  in  the 
design of a test workload is selected because it exhibits some
characteristics pertinent to the evaluation study (e.g., heavy loading, high
paging  rate,  etc.).  If  the  test  workload  does  not  exhibit  the  same 
characteristics,  the  evaluation  study  can  be  severely  hampered. 

If  the  test  workload  does  not  accurately  reflect  the  resource 
demands  of  the  real  workload  subset,  it  is  likely  due  to 

(a) errors in recording the resource demands, either because the
recording process was not accurate or because the resource demand
pattern was distorted (perhaps due to artifacts introduced by the
monitoring process itself),

(b)  errors  introduced  when  the  actual  workload  demands  are 
reduced  to  probability  distributions  or  clusters,  or 

(c)  errors  in  computing  the  synthetic  job  parameters. 

Errors of the first and second type are common to nearly all
methods of generating test workloads. They can be precluded only
by exercising extreme care in those stages of the construction process.
Errors  of  the  third  type  are  unique  to  test  workloads  generated  using 
synthetic  jobs.  Careful  design  of  the  calibration  experiments  should 
minimize  the  possibility  of  an  error  of  this  type  occurring. 

An  obvious  means  of  verifying  the  accuracy  of  the  synthetic  job 
parameters  is  to  execute  the  test  workload,  record  the  demands  placed 
upon  the  system  resources,  and  then  compare  the  resulting  probability 
distributions or demand clusters with those produced by the real
workload. A number of statistical tests (e.g., Chi-Square,
Kolmogorov-Smirnov) are available for testing "goodness of fit". Errors of the
first and second type mentioned above, however, could go undetected
using this process. The monitoring process will likely introduce the
same bias when the test workload is executed as it did during
processing of the actual workload subset. The same analysis package will
likely  be  used  to  summarize  both  the  resource  demands  of  the  actual 
workload  and  those  of  the  test  workload.  Thus,  the  same  errors  are 
apt  to  occur  in  both  analyses. 
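
As an illustration of the goodness-of-fit comparison described above, the following Python sketch computes the two-sample Kolmogorov-Smirnov statistic for one descriptor variable. The CPU times are hypothetical, the function names are assumptions, and the critical value uses the usual large-sample approximation, which is only rough for samples this small.

```python
from bisect import bisect_right
from math import log, sqrt


def ecdf(sorted_sample, x):
    """Fraction of observations less than or equal to x."""
    return bisect_right(sorted_sample, x) / len(sorted_sample)


def ks_statistic(real, test):
    """Largest gap between the two empirical distribution functions."""
    real, test = sorted(real), sorted(test)
    return max(abs(ecdf(real, x) - ecdf(test, x)) for x in real + test)


def ks_critical(n, m, alpha=0.05):
    """Large-sample critical value for the two-sample statistic."""
    return sqrt(-0.5 * log(alpha / 2)) * sqrt((n + m) / (n * m))


# Hypothetical CPU times (.01 sec) for a few real and synthetic jobs.
real_cpu = [3, 5, 8, 13, 21, 34, 55, 89]
test_cpu = [4, 6, 9, 12, 20, 30, 60, 95]

d = ks_statistic(real_cpu, test_cpu)
reject = d > ks_critical(len(real_cpu), len(test_cpu))
print(d, reject)
```

A statistic below the critical value gives no evidence that the two demand distributions differ. It does not, however, detect a bias that is common to both sets of measurements.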

The  validation  phase  of  test  workload  construction  is  probably 
the  least  understood  phase.  There  are  a  number  of  reasons  for  this. 
First, many studies never progress this far, since it is the last phase of
the  process  (although  the  calibration  phase  may  be  reentered  if  a 
non-representative  test  workload  is  produced).  Secondly,  to  avoid 
distorting  the  demand  characteristics  of  the  test  workload,  it  must 
be  executed  in  isolation  from  other  jobs  on  the  system.  This  requires 
a  dedicated  system  during  that  period  of  time,  which  is  sometimes 
inconvenient  and  expensive. 

5.7  Summary 

A  test  workload  can  be  constructed  using  synthetic  jobs.  The 
parameters  to  incorporate  into  the  design  of  the  synthetic  jobs  are 
determined  by  the  resource  descriptor  variables  used  to  characterize 
the  real  workload.  These  descriptor  variables  are  in  turn  determined 
by the performance variables required by the evaluation study.
Regression analysis can be used to establish the relationships between the
synthetic job parameters and the resource descriptor variables.
Statistical experimental design procedures can be applied to assist
in the design of these calibration experiments. Following the design
and calibration of the synthetic jobs, a synthetic mix can be
constructed by including the appropriate number of copies of each
synthetic job with the proper parameter settings. This test workload must
be  executed  on  the  system,  and  its  resource  demands  compared  with  those 
of  the  real  workload.  This  latter  process  is  termed  validation. 
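
The role of the predictor equations can be illustrated with a small Python sketch. It fits a straight line relating one synthetic job parameter to one resource descriptor and then inverts the line to find the setting that produces a target demand. The calibration data, the parameter name (loop count), and the single-parameter model are all hypothetical; an actual calibration would involve several parameters and descriptors.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


# Hypothetical calibration runs: loop count setting versus CPU time (.01 sec).
loop_counts = [100, 200, 300, 400, 500]
cpu_times = [7, 12, 17, 22, 27]

intercept, slope = fit_line(loop_counts, cpu_times)

# Invert the predictor equation to obtain the setting for a target demand.
target_cpu = 18.4
setting = (target_cpu - intercept) / slope
print(intercept, slope, setting)
```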


CHAPTER  VI 


CASE  STUDY 

6.1  Introduction 

A  methodology  for  constructing  a  test  workload  suitable  for  use 
in  a  performance  evaluation  study  has  been  developed  in  Chapters  III, 
IV,  and  V.  This  chapter  illustrates  this  methodology  with  a  case 
study  of  the  primary  computing  system  at  Texas  A&M  University. 

A  brief  description  of  the  present  system  configuration  begins 
the  study,  followed  by  a  description  of  the  system  workload  in  terms 
of  gross  workload  characteristics.  Succeeding  sections  illustrate 
the  application  of  techniques  to 

(a)  express  the  selected  workload  subset  as  a  resource  demand 
matrix; 

(b)  transform  this  demand  matrix  through  suitable  scaling  and 
principal  component  analysis; 

(c)  summarize  the  workload  subset  using  a  clustering  strategy; 

(d)  design  synthetic  jobs  to  replace  the  real  jobs  reflected  in 
the  selected  workload  subset. 


This study is not directed toward measuring any particular aspect
of the system's behavior. Rather, its aim is to demonstrate a
procedure by which a drive workload can be constructed. For this reason,
there is a degree of arbitrariness in some aspects of the study,
particularly in the workload subset which was selected. The selected
subset does not exhibit any particularly outstanding feature; it was
selected more or less at random. In an actual performance evaluation
study, considerable care must be taken in selecting a workload
subset which provides an appropriate environment for the study.

6.2  System  Description 

The  Texas  A&M  University  Computer  Network  is  a  centralized 
network  with  the  Amdahl  470/V6  at  its  hub.  Access  through  remote  job 
entry  (RJE)  is  possible  from  a  number  of  locations  throughout  Texas, 
including Amarillo, Austin, Brenham, Galveston, Prairie View,
Stephenville, Temple, Tyler, Texarkana, and Waco. In addition, four remote
computing centers are dispersed about the main campus of Texas A&M.
The Data Processing Center (DPC), which operates the network, acts as
a  centralized  data  processing  facility,  providing  data  processing 
services  in  support  of  the  academic,  research,  and  administrative 
functions  of  the  university. 

The  Amdahl  470/V6,  which  was  installed  in  late  1975,  is  the 
central  computer.  It  is  supplemented  by  various  mini/micro  computers 
which assist in data reduction and provide an opportunity for
"hands-on" instruction. The 470/V6 is presently equipped with six megabytes
of  main  memory,  a  sixteen  kilobyte  cache  memory,  and  has  a  cycle  speed 
of  32.5  nanoseconds.  Sixteen  data  channels  (0-F)  are  provided.  These 
I/O  processors  are  currently  assigned  as  follows: 

Channel 0 - Unit Record I/O
Channel 1 - 8 CALCOMP 3330 Mod I compatible disk drives
Channel 2 - Unit Record I/O
Channel 3 - 12 CALCOMP 3330 Mod II compatible disk drives
Channel 4 - COMTEN 3670 communications control module
Channel 5 - 12 CALCOMP 3340 compatible tape drives
Channel 6 - Alternate to channel 5
Channel 7 - Alternate to channel 3
Channel 8 - 80 IBM 3270 CRT terminals (IMS)
Channel 9 - Not utilized
Channel A - HASP pseudo devices (disk)
Channel B - 8 CALCOMP 3330 Mod I compatible disk drives
Channel C - Not utilized
Channel D - Not utilized
Channel E - Not utilized
Channel F - Not utilized

The  system  is  presently  operating  under  SVS  Release  1.7,  in  a 
HASP  4.0  environment.  SVS  swaps  virtual  memory  between  the  disk  and 
real  memory  in  4096  byte  segments  (pages).  TSO,  the  Time  Sharing 
Option  of  IBM  operating  systems,  provides  a  time  sharing  environment 
in  which  most  functions  available  to  the  batch  programmer  are  made 
available  to  the  terminal  user.  Other  software  subsystems  available 
include 

(a) APL-SV - A time-sharing system provided by IBM which allows
many terminal users concurrent access to the 470.

(b) IMS/VS - An IBM program product providing data base and
data communication facilities.

(c) SYSTEM 2000 - A general purpose data base management system
developed by MRI Systems Corporation.

(d) MARK IV - A file management system developed by Informatics,
Inc.

(e) PANVALET - A program management and security system developed
by Pansophic Systems, Inc.

(f) WYLBUR/370 - A text editing system developed at Stanford
University.

A  wide  variety  of  language  translators  are  provided.  Those 
supported  by  the  DPC  include 

(a)  ASSEMBLER  G  -  Assembly  language, 

(b)  ASSEMBLER  X  -  Assembly  language, 

(c)  ASSIST  -  Fast  student  assembler, 

(d)  ANS  COBOL  (version  3)  -  Business  oriented  language, 

(e)  FORTRAN  H  (extended)  -  Scientifically  oriented  language, 

(f)  OS/VS  COBOL  -  Business  oriented  language, 

(g) PL/C - Fast PL/I compiler,

(h) PL/I Optimizing Compiler - General programming language,

(i) WATBOL - Fast COBOL compiler, and

(j) WATFIV - Fast FORTRAN compiler.

In addition, language translators for ALGOL, SNOBOL, LISP, PASCAL,
and RPG are available, but are not supported by the DPC. A large
number of application packages are available, including GPSS, CSMP III,
SSP, SAS 76, SPSS, and IMSL.

6.3  Workload Description

The  workload  of  the  Amdahl  470/V6  is  composed  of  five  general 
categories  of  worksteps,  where  in  this  case,  "workstep"  refers  to  an 
increment  of  the  workload.  This  increment  could  be  a  job  in  a  batch 
environment, or a session in a timesharing environment. These
categories are:

(a)  Teaching  -  student  worksteps,  and  other  worksteps  run  in 
direct  support  of  teaching, 

(b)  Research  -  worksteps  related  to  research  projects, 

(c)  Administrative  -  worksteps  run  to  support  the  everyday 
operation  of  the  university, 

(d)  Commercial  -  worksteps  run  by  non-university  users, 

(e)  Overhead  -  billing  programs  and  other  worksteps  run  to 
support  the  operation  of  the  DPC. 

Although the proportion of the workload in each of these
categories varies, during October/November 1978 the breakdown was
Teaching - 58%, Research - 18%, Administrative - 6%, and
Commercial/Overhead - 18%. It should be noted that these are proportions of the
total  number  of  worksteps  processed  rather  than  of  total  resource 
utilization. 

For  this  study,  the  workload  for  the  period  January  1,  1978  to 
November 30, 1978 was examined. There were a total of 912,327
worksteps processed during this period, which accounted for 2944.66 hours
of chargeable CPU time. The following relative frequency histograms
show the distribution of the worksteps and CPU time over the eleven month
period. 


Fig.  6.1  Relative  Frequency  Histogram  for  Worksteps  Processed  -  Monthly 

The structure of the histogram depicting the proportion of
worksteps processed closely follows the academic terms. The spring
semester began in late January and continued through mid May; the summer
term ran from early June to mid August; and the fall term began in
late August and ran into December. The histogram depicting the
proportion of CPU time shows that the periods of maximum utilization of the
processor actually occurred during May and August, times of relatively
low student usage. This was caused by a heavy administrative workload
during those two periods. Grade reports are processed in May,
accounting for that "hump"; both grade reports and normal
end-of-the-fiscal-year processing account for the August "hump".

For  this  study,  it  was  decided  to  examine  a  period  which  exhibited 
a  balance  in  the  types  of  worksteps  processed.  The  period  selected 
was  a  two  week  period,  September  20  -  October  3.  This  period  should 
exhibit  the  desired  balance,  since  it  begins  approximately  one  fourth 
of  the  way  into  the  fall  semester.  Thus,  the  distortion  caused  by 
end-of-semester  administrative  processing  is  avoided.  Furthermore,  it 
is  far  enough  into  the  semester  so  that  student/research  activity  is 
relatively  heavy. 

The  workload  during  the  period  of  interest  displayed  a  strong 
weekly  trend.  This  is  caused  largely  by  the  work  week  and  operating 
hours  of  the  various  remote  processing  centers.  There  was  a  total  of 
46,730  worksteps  processed  during  the  two  week  period,  which  resulted 
in 127.03 hours of chargeable CPU time. The following relative
frequency histograms depict the distribution throughout the week.


Fig.  6.3  Relative  Frequency  Histogram  for  Worksteps  Processed 

The seeming contradictions in the above histograms are caused by
student jobs. A "happy hour" period is provided from 7:30 - 9:00 P.M.
on Sundays; 12:30 - 1:00 P.M. and 8:30 - 10:00 P.M. on Mondays through
Thursdays; and 12:30 - 1:00 P.M. on Fridays, during which jobs using the
student compilers are run without charge. Thus, the job counts during
these periods are abnormally high. CPU utilization is not affected
to the same degree, since these jobs characteristically have
minimal processing requirements.

Using  this  profile  as  a  guide,  a  two  hour  period  was  selected 
as  the  workload  subset  for  use  in  the  remainder  of  the  study.  In  an 
effort  to  keep  the  scope  of  the  study  reasonable,  it  was  decided  to 
restrict  it  to  the  batch  portion  of  the  system  workload.  It  should  be 
understood  that  to  produce  a  realistic  test  workload,  the  interactive 
portion of the workload would also have to be considered. This
analysis should parallel that of the batch workload, with the two types of
workloads  merged  at  the  end  to  provide  a  composite  test  workload. 

The two hour period from 9:00 - 11:00 A.M. on September 20, 1978
was  selected,  again  to  yield  a  balanced  workload.  This  period  avoids 
the  influx  of  student  jobs  caused  by  "happy  hour",  and  is  contained 
within  the  normal  workweek  so  that  administrative/overhead  jobs  are 
represented.  There  were  338  jobs  processed  during  this  period,  with 
170  of  them  compiled  using  the  in-core  student  compilers  (Autobatch 
jobs)  and  168  of  them  using  the  standard  OS  translators  (Batch  jobs). 
These  two  portions  of  the  batch  workload  were  analyzed  separately  due 
to  the  severe  restriction  in  resource  utilization  placed  upon  the 
Autobatch  jobs. 

6.4  Analysis of Autobatch Data

There  are  a  total  of  five  Autobatch  language  translators  provided 
for  use  with  small  jobs  which  require  limited  I/O  support.  In  addition, 
a  subset  of  the  Statistical  Analysis  System  (SAS76)  is  provided  in-core. 
The  resource  demand  patterns  for  jobs  executed  on  these  student-oriented 
translators  are  very  similar.  Input  data  is  through  the  standard  input 
file  (card);  output  is  either  in  printed  or  punched  form;  a  common 
region  size  (256  kilobytes)  is  used;  and  access  to  external  files  is 
prohibited.  Furthermore,  restrictions  are  placed  on  the  maximum  CPU 
time  utilized  and  maximum  output  produced. 

Due to the similarities in the resource demand characteristics of
Autobatch jobs, a limited resource descriptor set is adequate to
represent their contribution to the system workload. The descriptor set
selected includes CPU time (.01 sec), number of cards read, number of
lines printed, and number of cards punched. Data was collected on these
four variables during the selected period. Of the 170 Autobatch jobs
processed, none punched cards. Thus, this descriptor was eliminated
from the set. This resulted in a three variable set denoted hereafter
as {X1,X2,X3}, where X1 = number of cards read, X2 = number of lines
printed, and X3 = CPU time in .01 second increments.

The 170 jobs were spread fairly uniformly throughout the period.
The interarrival time distribution is depicted in figure 6.5. This
distribution is not crucial to the analysis of this section. It must
be considered, however, when constructing the final test workload.


Fig.  6.5  Interarrival  Distribution  -  Autobatch 


The resource demands of the Autobatch jobs are summarized in table
6.1.

Table 6.1  Resource Demand Characteristics - Autobatch

               X1        X2        X3
Min             4         9         1
Max           873      1328       229
Mean        129.4     162.7      18.4
Std Dev     152.4     203.9      31.8


The  variables  were  first  standardized  to  mean  0,  variance  1.  Then, 
the intercorrelations among the variables were examined. These
correlations are summarized in table 6.2.

Table 6.2  Correlation Matrix - Autobatch

            X1        X2        X3
X1       1.000     0.696     0.547
X2       0.696     1.000     0.758
X3       0.547     0.758     1.000
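
The standardization and correlation steps behind table 6.2 can be sketched in a few lines of Python. The job data below are hypothetical, and the use of the sample standard deviation (divisor n - 1) is an assumption; the statistical package used in the study may have used a different convention.

```python
from math import sqrt


def standardize(column):
    """Rescale a descriptor to mean 0 and (sample) variance 1."""
    n = len(column)
    mean = sum(column) / n
    std_dev = sqrt(sum((v - mean) ** 2 for v in column) / (n - 1))
    return [(v - mean) / std_dev for v in column]


def correlation(col_a, col_b):
    """Correlation of two descriptors, computed from standardized values."""
    za, zb = standardize(col_a), standardize(col_b)
    return sum(a * b for a, b in zip(za, zb)) / (len(za) - 1)


# Hypothetical descriptor values for five jobs.
cards_read = [20, 45, 130, 300, 873]
lines_printed = [30, 80, 150, 420, 1328]
cpu_time = [2, 5, 11, 40, 229]

columns = [cards_read, lines_printed, cpu_time]
matrix = [[correlation(a, b) for b in columns] for a in columns]
for row in matrix:
    print(["%.3f" % value for value in row])
```

The resulting matrix is symmetric with ones on the diagonal, as in table 6.2.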

The standardized variables X1, X2, and X3 were then subjected to
principal component analysis to transform them to the uncorrelated
variables Y1, Y2, and Y3. This analysis produced the following linear
relations for the composite variables.

Y1 = 0.55052X1 + 0.60964X2 + 0.57032X3

Y2 = 0.76719X1 - 0.10011X2 - 0.63356X3

Y3 = 0.32915X1 - 0.78633X2 + 0.52282X3

The  eigenvalues  corresponding  to  the  three  principal  components, 
the  portion  of  the  variability  in  the  data  explained  by  each  principal 
component,  and  the  cumulative  portion  are  displayed  in  the  following 
table. 


Table 6.3  Principal Components for Autobatch Data

                      Y1          Y2          Y3
Eigenvalues     2.337483    0.457655    0.204861
Portion            0.779       0.153       0.068
Cum Portion        0.779       0.932       1.000

A table similar to table 6.3 is useful in deciding how many of the
components to retain for the clustering stage. Due to the limited
number of variables involved, and the fact that the least significant
component (Y3) accounts for nearly 7% of the variability, no attempt
was made to reduce the dimensionality in this case.
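
The portion and cumulative portion rows of table 6.3 follow from the eigenvalues alone, as the Python sketch below shows. The eigenvalues are those of table 6.3; the function names and the retention thresholds are assumptions for illustration, since no fixed threshold is prescribed here.

```python
# Eigenvalues of the Autobatch correlation matrix, from table 6.3.
eigenvalues = [2.337483, 0.457655, 0.204861]


def explained(eigs):
    """Portion of total variance per component, and the running total."""
    total = sum(eigs)
    portions = [e / total for e in eigs]
    cumulative = []
    running = 0.0
    for portion in portions:
        running += portion
        cumulative.append(running)
    return portions, cumulative


def components_needed(eigs, threshold):
    """Smallest number of leading components reaching the threshold."""
    _, cumulative = explained(eigs)
    for count, value in enumerate(cumulative, start=1):
        if value >= threshold:
            return count
    return len(eigs)


portions, cumulative = explained(eigenvalues)
print([round(p, 3) for p in portions])
print([round(c, 3) for c in cumulative])
print(components_needed(eigenvalues, 0.90))
```

With an illustrative 90% threshold two components would suffice, while a 95% threshold requires all three.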

It is tempting when using principal component analysis to try
to attach a physical meaning to the components. Since in this study
principal components are isolated to examine the bias caused by the
intercorrelations and to give insight into possible reduction of the
dimension of the feature space, no attempt was made to attach such a meaning.
It is, however, interesting to note the intercorrelations among the
original standardized variables and the component variables. These
intercorrelations are shown in table 6.4.


Table 6.4  Intercorrelations Among Variables - Autobatch

              Y1          Y2          Y3
X1       0.84169     0.51901     0.14898
X2       0.93206    -0.06772    -0.35591
X3       0.87195    -0.42860     0.23664
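
The entries of table 6.4 are tied to the component equations and the eigenvalues of table 6.3: for standardized variables, the correlation between a variable and a component equals the variable's coefficient in that component multiplied by the square root of the component's eigenvalue. The Python sketch below checks this numerically. The signs of the coefficients are taken to agree with the signs in table 6.4, which is an assumption of this sketch, and the function names are hypothetical.

```python
from math import sqrt

eigenvalues = [2.337483, 0.457655, 0.204861]

# Rows are Y1..Y3, columns are X1..X3 (signs assumed to match table 6.4).
coefficients = [
    [0.55052, 0.60964, 0.57032],
    [0.76719, -0.10011, -0.63356],
    [0.32915, -0.78633, 0.52282],
]


def variable_component_correlations(coeffs, eigs):
    """Entry [i][j] is the correlation of variable i+1 with component j+1."""
    n_vars = len(coeffs[0])
    return [
        [coeffs[j][i] * sqrt(eigs[j]) for j in range(len(eigs))]
        for i in range(n_vars)
    ]


def dot(u, v):
    """Sum of products of two coefficient vectors."""
    return sum(a * b for a, b in zip(u, v))


correlations = variable_component_correlations(coefficients, eigenvalues)
for row in correlations:
    print(["%.5f" % value for value in row])

# Distinct components should be orthogonal (products near zero).
print(dot(coefficients[0], coefficients[1]))
```

The computed values agree with table 6.4 to within rounding of the published coefficients.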

Once the component scores were calculated, they were input to the
clustering algorithm detailed in Appendix B. The algorithm was run
iteratively for various numbers of clusters, and the sum of the squared
deviations about the cluster means was examined to determine an appropriate
number of clusters. A plot of this measure is depicted in figure 6.6.
There is an obvious compromise to be made between obtaining very "tight"
clusters and forming the minimum number of clusters necessary. For this
data, a reasonable compromise appeared to be five clusters.


Fig.  6.6  Plot  of  Cluster  "Tightness"  -  Autobatch 
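
The "tightness" measure plotted in figure 6.6 can be illustrated with the Python sketch below. A plain k-means procedure stands in for the clustering algorithm of Appendix B, which is not reproduced here, and the component scores are hypothetical. Seeding the centroids with the first k points is an arbitrary choice made to keep the example deterministic.

```python
def squared_distance(p, q):
    """Squared Euclidean distance between two points."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def centroid(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coord) / n for coord in zip(*points))


def assign(points, centers):
    """Place each point in the cluster of its nearest center."""
    clusters = [[] for _ in centers]
    for p in points:
        nearest = min(range(len(centers)),
                      key=lambda c: squared_distance(p, centers[c]))
        clusters[nearest].append(p)
    return clusters


def k_means(points, k, iterations=100):
    """Plain k-means, seeded with the first k points."""
    centers = [tuple(p) for p in points[:k]]
    for _ in range(iterations):
        clusters = assign(points, centers)
        updated = [centroid(m) if m else centers[i]
                   for i, m in enumerate(clusters)]
        if updated == centers:
            break
        centers = updated
    return centers, assign(points, centers)


def within_cluster_ss(centers, clusters):
    """Sum of squared deviations about the cluster means."""
    return sum(squared_distance(p, c)
               for c, members in zip(centers, clusters) for p in members)


# Hypothetical two-dimensional component scores.
scores = [(0.0, 0.1), (3.0, 3.1), (6.0, 0.1), (0.2, 0.0),
          (3.2, 2.9), (6.1, -0.1), (0.1, 0.2), (2.9, 3.0)]

for k in range(1, 5):
    centers, clusters = k_means(scores, k)
    print(k, round(within_cluster_ss(centers, clusters), 3))
```

Plotting the printed measure against k gives a curve of the kind shown in figure 6.6; the analyst looks for the point beyond which additional clusters yield little further reduction.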


The five clusters formed exhibited markedly different resource
demand patterns. To depict the difference, the approximate fractile
rankings of the cluster centroids were plotted on Kiviat graphs
[59,69,71]. These graphs, scaled from 0 at the center to 1 on the
perimeter, are shown in the following figures.

Fig. 6.11  Kiviat Graph for Cluster 5 - Autobatch


Examination  of  the  Kiviat  graphs  reveals  some  similarity  of 
structure.  For  example,  both  clusters  4  and  5  are  severely  imbalanced 
in favor of components Y2 and Y3. Clusters 1 and 3, on the other hand,
are  imbalanced  in  favor  of  components  Y-|  and  Y^.  Similarity  of  the  Kiviat 
graphs  may  tempt  the  analyst  to  consolidate  the  two  similar  clusters 
into one composite cluster. This may be feasible in some cases;
however, it should be done with care. The Kiviat graphs display
approximate fractile rankings, and, depending upon the variance in the
components,  a  slight  difference  in  the  fractile  ranking  can  involve 
a  significant  difference  in  the  magnitude  of  the  components. 

The interpretation of the clusters in terms of principal components
is difficult, since no physical significance was attached to the
components. For this reason, examination of the clusters in the
original space is necessary before consolidation of clusters is considered.
The cluster characteristics in terms of the original unscaled variables
are depicted in table 6.5.


Table 6.5  Cluster Compositions - Autobatch

                                  Cluster
                     1         2         3         4         5
Number              28        27        16        66        33
X1 (Mean)       130.36    214.85    491.00     29.71     82.64
X1 (Std dev)     30.47    109.23    182.60     13.58     22.36
X2 (Mean)       165.64    265.74    664.13     41.74     74.79
X2 (Std dev)     59.67    111.74    253.68     16.04     17.09
X3 (Mean)        13.57     35.22     86.56      3.36      5.91
X3 (Std dev)      9.48     15.23     60.43      1.38      3.54

Examination  of  table  6.5  reveals  that  there  are  indeed  significant 
differences  in  the  magnitude  of  the  demands  between  the  "similar" 
clusters.  No  consolidation  was  attempted  for  this  reason. 
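
A simple consistency check on table 6.5 is that the cluster means, weighted by cluster size, must reproduce the overall means of table 6.1. The Python sketch below performs this check; the values are read from the two tables, and the function name is hypothetical.

```python
# Cluster sizes and cluster means, as read from table 6.5.
sizes = [28, 27, 16, 66, 33]
cluster_means = {
    "X1": [130.36, 214.85, 491.00, 29.71, 82.64],
    "X2": [165.64, 265.74, 664.13, 41.74, 74.79],
    "X3": [13.57, 35.22, 86.56, 3.36, 5.91],
}


def pooled_mean(counts, means):
    """Overall mean recovered from cluster sizes and cluster means."""
    return sum(n * m for n, m in zip(counts, means)) / sum(counts)


overall = {name: pooled_mean(sizes, means)
           for name, means in cluster_means.items()}
print(sum(sizes))
for name in sorted(overall):
    print(name, round(overall[name], 1))
```

The five cluster sizes sum to the 170 Autobatch jobs, and the pooled means round to the values 129.4, 162.7, and 18.4 given in table 6.1.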

6.5  Analysis  of  Batch  Jobs 

The  restrictions  placed  upon  the  allowable  resource  demands  for 
Autobatch  jobs  are  not  applied  to  jobs  using  the  standard  OS  translators 
(Batch  jobs).  This  necessitates  an  expanded  resource  descriptor  set  to 
adequately  characterize  Batch  jobs,  since  the  range  of  the  resource 
demands  is  much  broader  for  these  jobs,  both  in  scope  and  magnitude. 

A  set  of  12  descriptor  variables  was  selected  to  represent  the 
demands  placed  on  the  system  by  Batch  jobs.  These  are 

(a) X1 = number of job steps executed,

(b) X2 = total number of devices used by the job,

(c) X3 = region size requested in kilobytes,

(d) X4 = number of cards read,

(e) X5 = number of lines printed,

(f) X6 = number of cards punched,

(g) X7 = number of pages read in,

(h) X8 = number of pages read out,

(i) X9 = CPU time in .01 second increments,

(j) X10 = I/O time in .01 second increments,

(k) X11 = EXCP count issued to tape devices, and

(l) X12 = EXCP count issued to disk devices (excluding HASP
pseudo devices).

These 12 variables represent the demands placed upon
the major system resources. They also allow discrimination between
different types of jobs, such as those which do tape I/O versus disk
I/O, or single step versus multi step jobs. An expanded feature set
could be used if desired, since reduction of the dimensionality of the
feature space is a part of the proposed methodology.

There  were  a  total  of  168  Batch  jobs  processed  during  the  selected 
period.  The  interarrival  distribution  of  these  jobs  is  similar  to 
that  of  the  Autobatch  jobs  as  seen  in  figure  6.12. 

The  168  Batch  jobs  exhibited  a  widely  varying  pattern  of  resource 
demands  as  illustrated  in  table  6.6. 

As  with  the  Autobatch  data,  the  variables  were  first  standardized, 
the  correlations  examined,  and  principal  component  analysis  performed. 
These  stages  of  the  analysis  are  summarized  in  tables  6.7  and  6.8. 


Table 6.6  Resource Demand Characteristics - Batch

           Min       Max        Mean     Std Dev
X1           1         6        1.56        1.03
X2           2        61       12.18       10.68
X3          64       512      159.24       80.76
X4           5      4619      257.87      655.33
X5           0     24979     1872.69     4771.51
X6           0      6548       76.11      674.57
X7           0       440       31.73       49.16
X8           0       384       12.43       34.58
X9           2     12731      336.39     1193.59
X10          0     29998      812.40     2948.55
X11          0     33020      354.67     2612.71
X12          0     47677     1009.78     4556.96

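Table 6.7  Correlation Matrix - Batch  [table body not recoverable]

Table 6.8  Principal Components for Batch Data  [only the cumulative portion row is recoverable]

Cum Portion  0.35  0.51  0.63  0.72  0.80  0.87  0.93  0.96  0.98  0.99  1.00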


Examination of table 6.8 shows that 96% of the total variance in
the data can be explained by retaining only 8 of the 12 components.
These 8 most significant components were selected to be input to the
clustering algorithm. The intercorrelations among these 8 most
significant components and the 12 original variables are shown in table 6.9.


Table 6.9  Intercorrelations Among Variables - Batch

          Y1      Y2      Y3      Y4      Y5      Y6      Y7      Y8
X1      0.40    0.47    0.54   -0.16    0.07    0.18   -0.37    0.36
X2      0.63    0.41    0.39   -0.01    0.09    0.17   -0.18   -0.46
X3      0.38    0.40    0.43   -0.08   -0.23    0.13    0.65    0.07
X4      0.01    0.17    0.22    0.61    0.70   -0.22    0.13    0.05
X5      0.22    0.15   -0.53    0.41    0.00    0.69    0.00    0.05
X6     -0.09   -0.13   -0.19   -0.66    0.65    0.24    0.15   -0.01
X7      0.89    0.25   -0.18   -0.07    0.02   -0.13   -0.04   -0.09
X8      0.83    0.25   -0.36   -0.06    0.02   -0.20   -0.05    0.16
X9      0.80   -0.51    0.06    0.09    0.02    0.07    0.11    0.06
X10     0.69   -0.63    0.17    0.02   -0.01    0.01   -0.02   -0.01
X11     0.61   -0.70    0.25    0.04    0.01    0.07   -0.03    0.02
X12     0.71    0.29   -0.49   -0.07   -0.01   -0.26    0.07    0.00
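
Because the original variables are standardized, the squared correlations in any one column of table 6.9 sum to the variance (eigenvalue) of that component, and dividing by the number of variables gives the portion of variance it explains. The Python sketch below applies this to the first two columns; the values are read from table 6.9, and the function name is hypothetical.

```python
# Correlations of X1..X12 with Y1 and Y2, as read from table 6.9.
y1_column = [0.40, 0.63, 0.38, 0.01, 0.22, -0.09,
             0.89, 0.83, 0.80, 0.69, 0.61, 0.71]
y2_column = [0.47, 0.41, 0.40, 0.17, 0.15, -0.13,
             0.25, 0.25, -0.51, -0.63, -0.70, 0.29]


def component_variance(column):
    """Sum of squared variable-component correlations."""
    return sum(r * r for r in column)


var_y1 = component_variance(y1_column)
var_y2 = component_variance(y2_column)
print(round(var_y1, 2), round(var_y1 / 12, 2))
print(round(var_y2, 2), round(var_y2 / 12, 2))
```

The first sum is approximately 4.23, in line with the value Var(Y1) = 4.22 quoted in section 6.6 given the two-decimal rounding of the table entries. The resulting portions, about 0.35 and 0.16, match the first two steps of the cumulative portions (0.35 and 0.51).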

To  determine  a  reasonable  number  of  clusters  to  form,  a  procedure 
similar  to  that  used  with  the  Autobatch  data  was  followed.  The  plot 
of  the  total  summed  deviations  about  the  cluster  means  is  shown  in 
figure  6.13. 

Based  upon  the  plot  of  figure  6.13,  a  reasonable  compromise 
appeared  to  be  to  form  10  clusters.  The  approximate  fractile  rankings 
of  the  cluster  centroids  are  depicted  in  the  following  Kiviat  graphs. 

The Kiviat graphs show a distinct structure for each cluster; thus,
it is unlikely that consolidation of any of the clusters would be
beneficial. The cluster compositions in terms of the original variables
are shown in table 6.10.

6.6  Comparison  of  Clustering  Results 

The  intercorrelation  among  the  resource  descriptor  variables  biases 
the  results  of  the  clustering  phase  of  analysis.  Various  weighting 
schemes were proposed in Chapter IV to neutralize this bias. In this
section, the clusters achieved when these weighting schemes were applied
to the Batch workload data will be compared. Similar experiments were
conducted using the Autobatch data with comparable results. Toward
the  end  of  the  section,  the  clustering  results  achieved  by  retaining 
the  eight  most  significant  components  will  be  compared  to  those  which 
were  achieved  by  retaining  all  12  of  the  components,  using  the  same 
weighting  scheme  in  both  cases. 

The  Batch  workload  data  was  standardized  and  then  subjected  to 
principal  components  analysis.  The  component  scores  (all  components 
retained)  were  then  input  to  the  clustering  algorithm  detailed  in 
Appendix  B  with  three  different  weighting  schemes.  The  first  run  used 
an unweighted Euclidean distance metric. The other two runs used a
weighted Euclidean distance metric with Wi = 1/xi in one case and
Wi = a. in the other case.
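
The form of the distance metric can be sketched in Python as follows. The function treats the unweighted metric as the special case in which every weight equals 1. The points and weight values shown are purely illustrative and are not the weighting schemes of Chapter IV.

```python
from math import sqrt


def weighted_distance(p, q, weights=None):
    """Weighted Euclidean distance; all weights default to 1 (unweighted)."""
    if weights is None:
        weights = [1.0] * len(p)
    return sqrt(sum(w * (a - b) ** 2 for w, a, b in zip(weights, p, q)))


# Two hypothetical points in a two-component space.
point_a = (2.0, 0.5)
point_b = (0.0, 0.0)

# Illustrative weights that de-emphasize the first component.
weights = (0.25, 4.0)

print(weighted_distance(point_a, point_b))
print(weighted_distance(point_a, point_b, weights))
```

With unit weights the first component dominates the distance; with the illustrative weights the two components contribute equally, which changes which points the clustering algorithm regards as neighbors.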

It is difficult to compare the partitions achieved using different
weighting schemes since the cluster memberships can change quite
drastically. Since the aim of applying the weighting function was to
more or less equalize the intracluster variation in each dimension,
one way to compare the results obtained is to examine the variance
(or standard deviation) for each variable within each cluster. It
is likely neither possible nor desirable to achieve true equality. This
is because of the wide disparity in the variances of the principal
component variables when all data units are considered (e.g., Var(Y1) =
4.22; Var(Y12) = 0.06). It should be apparent, however, that more
homogeneous clusters are formed if the intracluster variations are
small and nearly equal.
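
The comparison measure described above, the standard deviation of each component within each cluster, can be computed as in the following Python sketch. The scores and cluster labels are hypothetical, and the population formula (divisor n) is an assumption chosen so that a single-member cluster yields a deviation of zero.

```python
from math import sqrt


def cluster_std_devs(scores, labels):
    """Map each cluster label to its per-component standard deviations."""
    groups = {}
    for point, label in zip(scores, labels):
        groups.setdefault(label, []).append(point)
    result = {}
    for label, members in groups.items():
        n = len(members)
        deviations = []
        for column in zip(*members):
            mean = sum(column) / n
            deviations.append(sqrt(sum((v - mean) ** 2 for v in column) / n))
        result[label] = deviations
    return result


# Hypothetical component scores and cluster assignments.
scores = [(1.0, 2.0), (3.0, 2.0), (5.0, 5.0)]
labels = [1, 1, 2]

print(cluster_std_devs(scores, labels))
```

A cluster containing a single job produces zeros in every component, which is how an all-zero column in a table of cluster standard deviations would arise.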

Since  the  clustering  algorithm  was  applied  to  the  principal 
component scores, any comparison made is most meaningful in the
principal components space. The intracluster standard deviations for each
principal  component  variable  within  each  cluster  are  shown  in  tables 
6.11,  6.12,  and  6.13. 

Table 6.11  Cluster Standard Deviations - Wi = 1

                                     Cluster
          1       2       3       4       5       6       7       8       9      10
Y1      .46    1.05    8.74    2.95    3.67    1.01    0.00    2.41     .55     .84
Y2      .20     .75    2.43    2.09    1.74     .45    0.00    1.02     .24     .49
Y3      .24     .52    3.88    1.24    1.03     .52    0.00    1.05     .13     .40
Y4      .04    1.01     .32     .37     .94     .17    0.00     .48     .07     .28
Y5      .10     .97     .10     .59     .69     .28    0.00     .39     .10     .17
Y6      .10     .42     .95     .19     .73     .22    0.00     .42     .09     .47
Y7      .35     .30     .16     .66     .79     .69    0.00     .79     .14     .12
Y8      .14     .22    1.14     .28     .28     .35    0.00     .54     .16     .26
Y9      .06     .10     .39     .39     .10     .17    0.00     .32     .07     .11
Y10     .03     .20     .15     .15     .06     .07    0.00     .06     .02     .04
Y11     .02     .03     .15     .24     .08     .03    0.00     .05     .01     .04
Y12     .03     .04     .09     .11     .03     .09    0.00     .06     .03     .03

Table 6.12  Cluster Standard Deviations - Wi = 1/xi

                                     Cluster
          1       2       3       4       5       6       7       8       9      10
Y1     0.00    3.42    3.08    2.03    1.73    1.27   13.60    5.36     .97     .84
Y2     0.00    1.08    1.32     .85     .87     .51    6.15    2.11     .73     .49
Y3     0.00    1.18    2.21    1.18     .93     .34    2.05    1.27     .78     .40
Y4     0.00     .93     .72     .38     .84     .16     .35     .47     .24     .28
Y5     0.00     .90    1.01     .39     .93     .13     .23     .30     .31     .17
Y6     0.00     .48     .62     .21     .50     .25     .62     .51     .29     .47
Y7     0.00     .34     .74    1.01     .57     .43     .35    1.02     .28     .12
Y8     0.00     .28     .36     .37     .08     .29     .08     .42     .18     .26
Y9     0.00     .11     .41     .17     .07     .13     .10     .33     .08     .11
Y10    0.00     .07     .05     .13     .05     .02    1.11     .09     .03     .04
Y11    0.00     .03     .14     .04     .07     .03     .07     .22     .02     .04
Y12    0.00     .04     .08     .09     .02     .05     .06     .04     .02     .03

Table 6.13  Cluster Standard Deviations - $W_i = \lambda_i$

                                Cluster
          1     2     3     4     5     6     7     8     9    10
Y1      .21   .42  2.74  1.01  0.00   .84  0.00   .50   .59   .85
Y2      .20   .46  2.64  1.02  0.00   .89  0.00   .49   .39   .49
Y3      .22   .52  2.24   .90  0.00   .41  0.00   .69   .46   .40
Y4      .07  1.33   .85   .36  0.00   .52  0.00  1.27   .14   .28
Y5      .06  1.31  1.03   .24  0.00   .58  0.00  1.12   .35   .17
Y6      .04   .47   .69   .31  0.00   .35  0.00   .45   .14   .47
Y7      .29   .40   .96   .71  0.00   .40  0.00   .99   .93   .12
Y8      .13   .12   .81   .42  0.00   .29  0.00   .45   .46   .26
Y9      .04   .05   .52   .13  0.00   .13  0.00   .15   .11   .11
Y10     .01   .08   .10   .15  0.00   .26  0.00   .06   .07   .04
Y11     .01   .02   .26   .10  0.00   .04  0.00   .05   .02   .04
Y12     .02   .03   .09   .12  0.00   .05  0.00   .05   .06   .03


Close examination of tables 6.11, 6.12, and 6.13 tends to
confirm the conclusions of Chapter IV, particularly if viewed in terms
of the extreme values of the cluster standard deviations. Table 6.11,
based upon the unweighted distance metric, has a maximum value of 8.74,
and a total of 16 values greater than 1. Table 6.12, based upon the
weighted distance metric with $W_i = 1/\lambda_i$, has a maximum value of 13.60,
and a total of 20 values greater than 1. Table 6.13, based upon the
weighted distance metric with $W_i = \lambda_i$, has a maximum value of 2.74, and
a total of 9 values greater than 1. Thus, the weighting scheme with
$W_i = 1/\lambda_i$ actually performs worse than the unweighted scheme, while
significant improvement is noted when $W_i = \lambda_i$ is used.

Of particular note with this data is the manner in which the three
schemes handled outlier jobs. There were two jobs which were much
larger in terms of resource requirements than any others in the subset.
Both jobs performed an excessive amount of I/O, with one accessing tape
devices and the other disk devices. Both the unweighted version and the
weighted version with $W_i = 1/\lambda_i$ grouped at least one of these outlier
jobs with other data units, thus producing a very inhomogeneous cluster.
Only the weighted scheme with $W_i = \lambda_i$ "correctly" classified these two
jobs into two single member clusters.

Comparison of the results obtained with a weighted distance metric
($W_i = \lambda_i$) when 8 and 12 of the principal components are retained
indicates very little change. Of the 10 clusters obtained with 12 components,
5 remain intact when only 8 components are retained
(including the two "outlier" clusters mentioned above). There are
but minor changes in 4 of the 5 remaining clusters. The lone cluster
which changed drastically was a small cluster (9 data units) in which
even minor alterations in cluster membership can have a dramatic effect
on cluster characteristics. In all, 15 of the 168 data units migrated
(i.e., changed clusters) when the four least significant components
were dropped. This performance is somewhat related to the weighting
scheme used. Similar experiments were conducted using the unweighted
distance metric and the weighted distance metric ($W_i = 1/\lambda_i$). The
effect of dropping the four least significant components was more
severe with these two schemes.

6.7  Construction  of  Synthetic  Jobs 

The  construction  of  synthetic  jobs  to  replace  the  real  jobs  in  the 
selected  workload  subset  is  the  next  logical  step  following  clustering. 

A separate synthetic job is generally required to represent each cluster.
There may be exceptions to this, however. Two clusters may be similar
enough  that  a  single  synthetic  job  can  be  used  to  represent  the  jobs 
in  the  composite  cluster  formed  by  merging  the  two.  A  single  cluster, 
on  the  other  hand,  may  be  too  "loose"  to  allow  adequate  representation 
of  its  members  with  a  single  synthetic  job.  Such  a  cluster  must  be 
split  into  subclusters,  each  of  which  is  represented  by  a  separate 
synthetic  job.  After  synthetic  jobs  are  constructed  for  each  cluster/ 
subcluster,  the  synthetic  mix  can  be  formed  by  including  the  appropri¬ 
ate  number  of  copies  of  each  job  and  appending  the  arrival  time  to  each. 
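
The final step, forming the mix, can be sketched as below. The job names, cluster numbers, and arrival times are hypothetical placeholders; the sketch only assumes that each real job in the subset is replaced by the synthetic job of its cluster and keeps its own arrival time.

```python
def build_mix(arrivals, job_for_cluster):
    """arrivals: (arrival_time, cluster) pairs from the real workload subset.
    Returns (arrival_time, synthetic_job) pairs in order of arrival."""
    return sorted((t, job_for_cluster[c]) for t, c in arrivals)

# Hypothetical synthetic job names and arrival times, for illustration only.
job_for_cluster = {4: "SYNAUTO4", 6: "SYNBAT6"}
arrivals = [(12.5, 6), (3.0, 4), (7.25, 4)]
mix = build_mix(arrivals, job_for_cluster)
for arrival_time, job in mix:
    print(arrival_time, job)
```

The number of copies of each synthetic job then equals the number of real jobs in the corresponding cluster.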

Synthetic jobs were constructed for one Autobatch cluster and
one Batch cluster to illustrate the design technique. The Autobatch
cluster selected was cluster 4 (see table 6.5 (p. 91)), while the Batch
cluster selected was cluster 6 (see table 6.10 (p. 103)).

Synthetic jobs designed to represent the Autobatch jobs can be
very simple jobs due to the limited resource descriptor set. Three
resource demands must be controlled: $X_1$ = number of cards read, $X_2$ =
number of lines printed, and $X_3$ = CPU time used (.01 sec). The number
of  cards  read  is  exactly  determined  by  the  number  of  source/comment 
statements  in  the  program  and  the  number  of  JCL/data  cards.  This 
number  can  be  varied  within  certain  limits  for  a  given  synthetic  job 
by  either  including  or  excluding  data/comment  cards.  The  number  of 
lines  printed  can  also  be  exactly  controlled  by  including  a  print  loop 
which  is  executed  the  desired  number  of  times.  CPU  time  used  is  con¬ 
trolled  by  executing  a  compute  loop  a  certain  number  of  times.  The 
amount  of  CPU  time  is  also  related  to  the  number  of  lines  printed, 
hence  this  dependence  must  be  accounted  for.  The  synthetic  job  designed 
for  Autobatch  cluster  4  is  described  in  detail  in  Appendix  D. 

The synthetic job for Autobatch cluster 4 has two parameters which
may be varied to induce various resource demand patterns. These parameters
are NRLIN = the number of lines to be printed and NITER = the
number of times the compute loop is to be executed. The size of the
program (number of cards read) was held constant throughout. These two
parameters were used as "treatments" in the experimental design.
Three "levels" for each "treatment" were established to cover the
range of resource demands exhibited by the members of Autobatch cluster
4. This results in nine unique treatment/level combinations. A completely
randomized factorial design ($3^2$) was used to establish the
parameter settings for the nine required runs of the job. The parameter
settings for each run are shown in table 6.14.

Table 6.14  Parameter Settings - Autobatch

                                Run
            1     2     3     4     5     6     7     8     9
NRLIN      50    50    50   150   150     0     0   150     0
NITER      50  5000  2500    50  2500  5000    50  5000  2500

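
A completely randomized $3^2$ factorial design of this kind can be generated mechanically. The sketch below is illustrative only: the function name and the seed are assumptions, while the level values are those appearing in table 6.14.

```python
import itertools
import random

def randomized_factorial(levels, seed=None):
    """All level combinations of a full factorial design, in random run order."""
    names = list(levels)
    runs = [dict(zip(names, combo))
            for combo in itertools.product(*(levels[n] for n in names))]
    random.Random(seed).shuffle(runs)   # randomize the order of the runs
    return runs

# Three levels per treatment; the seed value is arbitrary.
design = randomized_factorial(
    {"NRLIN": [0, 50, 150], "NITER": [50, 2500, 5000]}, seed=1)
for run_number, setting in enumerate(design, start=1):
    print(run_number, setting["NRLIN"], setting["NITER"])
```

Each of the nine treatment/level combinations appears exactly once; only the order of the runs is random.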

The  nine  programs  were  run  on  the  system  and  data  collected  which 
reflected  the  resource  demands  of  each  program.  This  data  is  summarized 
in  table  6.15. 

Table 6.15  Resource Demands - Synthetic Autobatch Job

                                Run
            1     2     3     4     5     6     7     8     9
X1         33    33    33    33    33    33    33    33    33
X2         88    88    88   188   188    38    39   188    38
X3          5    77    41     9    43    74     3    80    38


The significance of the effect of varying NITER and NRLIN on $X_3$
was then tested. Both "treatments" were found to be highly significant
($\alpha$ = .0001). The model used assumed no interaction between the parameters.
The amount of CPU time used ($X_3$) was regressed on NITER and
NRLIN, while the number of lines printed ($X_2$) was regressed on NRLIN.
The following predictor equations were obtained through this regression:

$$X_2 = 38.238 + 0.998\,\mathrm{NRLIN}$$
$$X_3 = 2.399 + 0.037\,\mathrm{NRLIN} + 0.014\,\mathrm{NITER}$$

The fit achieved by both regression equations was extremely good. The
value of the coefficient of multiple determination $R^2$ (proportion of the
variability explained) was 0.999978 for the equation relating $X_2$ to NRLIN, and
0.999679 for the equation relating $X_3$ to NRLIN and NITER.
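
The regression step can be reproduced with ordinary least squares. The sketch below is not the statistical package used in the study; it solves the normal equations directly. The data are transcribed from tables 6.14 and 6.15, with the few entries that are illegible in the source copy inferred from the $3^2$ design (an assumption).

```python
def fit_linear(columns, y):
    """Ordinary least squares with intercept, via the normal equations."""
    rows = [[1.0] + [float(c[i]) for c in columns] for i in range(len(y))]
    k = len(rows[0])
    # Augmented normal-equation matrix [A^T A | A^T y].
    aug = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
           + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(k):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][k] / aug[i][i] for i in range(k)]

# Parameter settings and measured demands for the nine runs (partly inferred).
nrlin = [50, 50, 50, 150, 150, 0, 0, 150, 0]
niter = [50, 5000, 2500, 50, 2500, 5000, 50, 5000, 2500]
x2 = [88, 88, 88, 188, 188, 38, 39, 188, 38]   # lines printed
x3 = [5, 77, 41, 9, 43, 74, 3, 80, 38]         # CPU time (.01 sec)

b2 = fit_linear([nrlin], x2)          # [intercept, NRLIN coefficient]
b3 = fit_linear([nrlin, niter], x3)   # [intercept, NRLIN, NITER coefficients]
print([round(b, 3) for b in b2])
print([round(b, 3) for b in b3])
```

With these inputs the fitted coefficients round to the values in the predictor equations above.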

Synthetic jobs designed to represent Batch jobs must be considerably
more complex than those for Autobatch jobs due to the expanded
resource descriptor set. The descriptor set used for the Batch jobs
includes 12 variables. A number of these can be exactly controlled
through Job Control Language (JCL) statements or the inclusion/exclusion
of data/comment cards. Others must be controlled through
parameters.

The synthetic job designed for Batch cluster 6 (described in
Appendix D) has four parameters which can be varied to induce different
resource demand patterns. They are NITER = the number of times the
compute loop is executed, NOUT = the number of output lines produced,
NTAP = the number of records read from a tape file, and NDIS = the
number of records read from a disk file. Those resource demands which
are not affected by varying these parameters were held constant throughout
the experiment.

Two levels for each parameter were selected. A completely
randomized factorial design ($2^4$) was used to establish the parameter
settings for the various runs of the program. This design requires 16
runs to form one replication of the experiment. This was considered
excessive due to the cost associated with each run, so it was decided to
use a fractional replication. A one-half fractional
replication requires only eight runs, but still allows testing of the
main treatment effects. The effect of interaction among parameters was
assumed negligible, just as with the Autobatch experiment. Using the
method illustrated in Hicks [43], the "treatment" combinations were
divided into two blocks, with the four-way interaction effect confounded
with the block effect. A coin flip was used to decide which
of the blocks to use in the experiment. The parameter settings for the
eight required runs of the job are listed in table 6.16.


Table 6.16  Parameter Settings - Batch

                             Run
            1     2     3     4     5     6     7     8
NITER    1000     0  1000     0     0     0  1000  1000
NOUT        0     0  1000     0  1000  1000     0  1000
NTAP        0  1000  1000     0     0  1000  1000     0
NDIS        0     0     0  1000     0  1000  1000  1000

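
The blocking that yields a one-half fractional replication can be sketched as follows. This illustrates the general technique of confounding the four-way interaction with blocks; it is not a transcription of the procedure in Hicks [43], and the coded levels (-1 for 0, +1 for 1000) are an assumption of the sketch.

```python
import itertools
import random

FACTORS = ("NITER", "NOUT", "NTAP", "NDIS")

def half_fraction_blocks(factors):
    """Split the 2^k runs into two blocks by the sign of the k-way interaction."""
    blocks = {+1: [], -1: []}
    for signs in itertools.product((-1, +1), repeat=len(factors)):
        product = 1
        for s in signs:
            product *= s
        blocks[product].append(dict(zip(factors, signs)))
    return blocks

blocks = half_fraction_blocks(FACTORS)
chosen = blocks[random.choice((+1, -1))]      # the "coin flip"
# Map coded levels to parameter settings: -1 -> 0, +1 -> 1000.
runs = [{name: 1000 if sign > 0 else 0 for name, sign in run.items()}
        for run in chosen]
random.shuffle(runs)                           # randomize the run order
for run in runs:
    print(run)
```

The eight runs of table 6.16 each have an odd number of parameters at the high level, which corresponds to the block with interaction sign -1 in this coding.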

The  synthetic  jobs  were  run  on  the  system,  and  data  collected 
reflecting  the  resource  demands.  This  data  is  shown  in  table  6.17. 

The  values  for  all  12  resource  descriptors  are  shown;  those  which  are 
not  affected  by  the  four  parameters  appear  as  constants.  No  attempt 
was  made  to  control  paging  behavior  as  this  is  largely  environment 
dependent. 

Table 6.17 shows that five of the 12 resource descriptors are
affected by varying the four parameters. These are $X_5$ (number of lines
printed), $X_9$ (CPU time used in .01 sec increments), $X_{10}$ (I/O time used
in .01 sec increments), $X_{11}$ (EXCP count to tape devices), and $X_{12}$ (EXCP
count to disk devices).

Table 6.17  Resource Demands - Synthetic Batch Job

                             Run
            1     2     3     4     5     6     7     8
X1          1     1     1     1     1     1     1     1
X2         14    14    14    14    14    14    14    14
X3        128   128   128   128   128   128   128   128
X4        240   240   240   240   240   240   240   240
X5        354   351  1349   351  1349  1349   351  1349
X6          0     0     0     0     0     0     0     0
X7          0     0     0     0     0     0     0     0
X8          0     0     0     0     0     0     0     0
X9        242   109   332   111   191   201   247   329
X10       188   213   212   233   187   256   257   232
X11         1    10    10     1     1    10    10     1
X12       132   132   132   217   132   217   217   217

The significance of the effect of the parameters
on the descriptor variables was tested. Using a level of significance
$\alpha$ = .05, the effect on $X_9$ was significant for NITER and NOUT; the
effect on $X_{10}$ was significant for NOUT, NTAP, and NDIS; the effect on
$X_5$ was significant for NOUT; the effect on $X_{11}$ was significant for
NTAP; and the effect on $X_{12}$ was significant for NDIS.

The descriptor variables were then regressed on those parameters
which were identified as having a statistically significant effect.
The resulting regression equations, with the value of $R^2$ (the squared
multiple correlation coefficient) indicated in parentheses, are

$$X_5 = 351.75 + 0.99725\,\mathrm{NOUT} \quad (R^2 = 0.999997),$$
$$X_9 = 110.00 + 0.1345\,\mathrm{NITER} + 0.0860\,\mathrm{NOUT} \quad (R^2 = 0.998648),$$
$$X_{10} = 188.25 - 0.001\,\mathrm{NOUT} + 0.025\,\mathrm{NTAP} + 0.0445\,\mathrm{NDIS} \quad (R^2 = 0.999903),$$
$$X_{11} = 1.00 + 0.009\,\mathrm{NTAP} \quad (R^2 = 1.000000), \text{ and}$$
$$X_{12} = 132.00 + 0.085\,\mathrm{NDIS} \quad (R^2 = 1.000000).$$


The problem with inverting the above equations to yield predictor
equations for the parameter settings is that there is one equation too
many (i.e., 5 equations in 4 unknowns). The equation for $X_{10}$, however,
is seen to be redundant, since I/O time is uniquely determined by the
quantity and type of I/O performed. Inverting the remaining relations
yields the following predictor equations:

$$\mathrm{NOUT} = 1.00276\,X_5 - 352.72,$$
$$\mathrm{NITER} = 7.4349\,X_9 - 0.6411\,X_5 - 592.27,$$
$$\mathrm{NTAP} = 111.1111\,X_{11} - 111.11, \text{ and}$$
$$\mathrm{NDIS} = 11.7647\,X_{12} - 1552.95.$$
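
The inversion can be carried out directly from the fitted coefficients, as in the sketch below. The function name and the target demand values are hypothetical; the coefficients are those of the regression equations for $X_5$, $X_9$, $X_{11}$, and $X_{12}$.

```python
def parameter_settings(x5, x9, x11, x12):
    """Invert the fitted demand equations to obtain synthetic-job parameters."""
    nout = (x5 - 351.75) / 0.99725
    # CPU time depends on both NITER and NOUT, so NOUT is removed first.
    niter = (x9 - 110.00 - 0.0860 * nout) / 0.1345
    ntap = (x11 - 1.00) / 0.009
    ndis = (x12 - 132.00) / 0.085
    return {"NITER": round(niter), "NOUT": round(nout),
            "NTAP": round(ntap), "NDIS": round(ndis)}

# Hypothetical target demands (not taken from any cluster in the study).
settings = parameter_settings(x5=850, x9=230, x11=5.5, x12=174.5)
print(settings)
```

Working from the unrounded regression coefficients rather than the printed predictor equations avoids compounding rounding error in the constants.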


6.8  Summary

A  statistical  methodology  proposed  for  use  in  constructing  test 
workloads  was  developed  in  Chapters  III,  IV,  and  V.  The  major  elements 
of  this  methodology  are  illustrated  in  this  chapter  with  a  detailed 
case  study  of  the  workload  processed  by  the  Amdahl  470/V6  at  Texas 
A&M  University. 

The  first  task  in  constructing  a  test  workload  is  determining 
a  subset  of  the  real  workload  to  use  as  a  model.  The  appropriate 
workload  subset  is  related  to  the  particular  evaluation  study  being 
performed.  An  overall  workload  profile  can  be  constructed,  and  an 
applicable  subset  selected  by  viewing  the  characteristics  displayed  in 
the  profile.  This  study  was  not  directed  toward  any  particular  evalu¬ 
ation  effort,  hence  the  choice  of  the  subset  was  somewhat  arbitrary. 

The selected workload subset was found to be composed of two
basic types of jobs, those using the student compilers (Autobatch)
and those using the standard OS translators (Batch). A limited
resource descriptor set is adequate for characterizing the resource
demands of the Autobatch jobs while an expanded set is required for
Batch jobs. The two types of jobs were analyzed separately for this
reason.

The  resource  demand  matrix  was  first  scaled  so  that  each  descriptor 
variable  had  a  mean  of  0  and  a  variance  of  1.  This  scaled  matrix  was 
then  subjected  to  principal  component  analysis,  to  transform  the 
demand  vectors  to  a  space  of  uncorrelated  composite  variables.  Those 
component  variables  necessary  to  explain  95%  of  the  total  variability 
were  retained.  This  resulted  in  the  retention  of  all  three  of  the 
component  variables  for  the  Autobatch  data.  Only  eight  of  the  12 
component  variables  for  the  Batch  data  were  retained,  however,  reducing 
the  dimensionality  of  the  problem  by  one  third. 
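
The scaling step can be sketched as follows. The demand matrix is hypothetical, and the handling of a constant descriptor (mapped to zeros) is an assumption of this sketch; the divisor m matches the variance-covariance definition used in Appendix A.

```python
from math import sqrt

def standardize(matrix):
    """Scale each column (descriptor) to mean 0 and variance 1."""
    m = len(matrix)
    scaled_columns = []
    for col in zip(*matrix):
        mean = sum(col) / m
        std = sqrt(sum((v - mean) ** 2 for v in col) / m)
        # A constant descriptor has zero variance and cannot be scaled;
        # it is mapped to zeros here (an assumption of this sketch).
        scaled_columns.append(
            [(v - mean) / std if std > 0 else 0.0 for v in col])
    return [list(row) for row in zip(*scaled_columns)]

# Hypothetical demand matrix: rows are jobs; columns are cards read,
# lines printed, and CPU time.
demand = [[33, 88, 5], [40, 188, 77], [25, 38, 41], [60, 120, 9]]
scaled = standardize(demand)
for row in scaled:
    print([round(v, 3) for v in row])
```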

The principal component scores were input to a non-hierarchical
clustering algorithm using a weighted Euclidean distance metric.
Various weighting schemes were tried, with the "best" results obtained
by weighting each component variable by the proportion of the variability
it explains. The number of clusters to form in each case was
determined somewhat subjectively by iteratively running the algorithm
for various numbers of clusters and examining the sum of the squared
deviations about the cluster centroids. The results of the clustering
algorithm were illustrated using Kiviat graphs which displayed the
approximate fractile ranking of the cluster centroids for each cluster.
Kiviat graphs were not originally designed for this purpose. They are
useful, however, in presenting the multidimensional nature of workload
data.
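
The weighted distance and the clustering criterion can be sketched as below. This is not the clustering program used in the study: it shows only a single nearest-centroid assignment pass, and it assumes the weight multiplies the squared difference in each dimension. The eigenvalues, scores, and centroids are hypothetical.

```python
def weighted_sq_distance(a, b, weights):
    """Squared weighted Euclidean distance between two score vectors."""
    return sum(w * (x - y) ** 2 for w, x, y in zip(weights, a, b))

def assign(scores, centroids, weights):
    """Index of the nearest centroid for every data unit."""
    return [min(range(len(centroids)),
                key=lambda k: weighted_sq_distance(s, centroids[k], weights))
            for s in scores]

def within_cluster_ss(scores, centroids, labels, weights):
    """Sum of squared (weighted) deviations about the assigned centroids."""
    return sum(weighted_sq_distance(s, centroids[k], weights)
               for s, k in zip(scores, labels))

# Weight each component by the proportion of variability it explains.
eigenvalues = [1.5, 0.5]
total = sum(eigenvalues)
weights = [ev / total for ev in eigenvalues]
scores = [[0.0, 0.0], [0.2, 3.0], [2.0, 0.1], [2.2, -0.2]]
centroids = [[0.0, 1.0], [2.0, 0.0]]
labels = assign(scores, centroids, weights)
ss = within_cluster_ss(scores, centroids, labels, weights)
print(labels, round(ss, 4))
```

Repeating the full clustering for several numbers of clusters and comparing the resulting sums of squares is the somewhat subjective selection procedure described above.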


Two  clusters,  one  for  Autobatch  and  one  for  Batch,  were  selected 
as  models  to  use  in  the  design  of  synthetic  jobs.  Following  the  design 
of  the  two  jobs,  a  completely  randomized  factorial  design  was  used  to 
guide  the  collection  of  data  and  to  test  the  significance  of  the  effects 
that  the  synthetic  job  parameters  have  on  the  various  resource  demands. 
Regression  analysis  was  performed  to  yield  predictor  equations  for  the 
resource demands as functions of the synthetic job parameters.

CHAPTER VII

SUMMARY  AND  CONCLUSIONS 

7.1  Review  of  the  Proposed  Methodology 

The  construction  of  a  representative  test  workload  is  an  integral 
part  of  any  computer  performance  evaluation  study.  A  methodology 
which  is  proposed  for  use  in  constructing  test  workloads  has  emerged 
from  this  research.  The  major  elements  of  this  methodology  are 

(a)  selecting  the  workload  subset  by  constructing  an  overall 
workload  profile  and  then  choosing  a  period  which  exhibits  character¬ 
istics  pertinent  to  the  evaluation  study, 

(b)  choosing  a  set  of  descriptor  variables  which  is  detailed 
enough  to  represent  the  demand  placed  upon  the  major  system  resources, 
but  is  not  so  detailed  as  to  complicate  later  stages  of  analysis, 

(c)  collecting  data  reflecting  the  values  of  the  descriptor 
variables  for  the  worksteps  in  the  selected  subset, 

(d)  scaling  the  resource  demand  matrix  so  that  each  descriptor 
variable  has  mean  0  and  variance  1, 

(e)  applying  principal  components  analysis  to  the  scaled  resource 
demand  matrix  and  retaining  only  those  components  needed  to  explain 
the  major  part  of  the  variability  in  the  data, 

(f) clustering the transformed resource demand vectors in the
principal components space using a non-hierarchical clustering
algorithm with a weighted Euclidean distance measure,

(g)  designing  synthetic  jobs  for  each  of  the  isolated  clusters 
using  regression  analysis  to  obtain  predictor  equations  for  the  param¬ 
eter  settings, 

(h)  forming  a  synthetic  job  mix  by  combining  a  sufficient  number 
of  copies  of  the  various  synthetic  jobs  with  appropriate  parameter 
settings  and  the  desired  arrival  time  of  each,  and 

(i) validating the generated synthetic job mix by executing it
on  the  system  being  studied,  comparing  its  characteristics  with  those 
of  the  real  workload  subset,  and  adjusting  the  parameter  settings  as 
necessary. 

7.2  Automatic  Generation  of  Test  Workloads 

The construction of test workloads is a time-consuming, tedious,
and error-prone procedure. Using the proposed methodology, the major
portion  of  this  task  can  be  automated.  Automation  will  release  the 
analyst  from  this  tedious  chore.  It  will  also  provide  benefits  in  the 
areas  of  flexibility,  ease  of  modification,  and  reproducibility.  This 
section  will  describe  the  design  of  an  automatic  benchmark  generator 
based  upon  the  proposed  methodology. 

It  is  not  likely  that  the  first  three  elements  of  the  proposed 
methodology  can  be  automated  to  any  degree.  Considerable  insight  is 
required  to  select  an  appropriate  workload  subset  and  to  determine  the 
set  of  descriptor  variables  which  will  adequately  represent  a  given 
workstep's true demand on the system. Furthermore, the criteria
used to judge a workload subset applicable to a given study change
from one study to the next. One study may require an I/O bound
workload;  another  study  may  require  a  compute  bound  workload;  and  a 
third  study  may  require  a  balanced  workload.  It  is  a  straightforward 
task  to  collect  the  appropriate  data,  once  the  desired  workload  subset 
is  selected  and  the  descriptor  variables  determined.  In  the  remainder 
of  this  section,  then,  it  will  be  assumed  that  the  real  workload  is 
presented  to  the  generator  in  the  form  of  a  resource  demand  matrix. 

The  arrival  time  of  the  request,  possibly  its  originating  location  if 
operating  in  a  distributed  environment,  and  a  flag  indicating  the  type 
of  workstep  (i.e.  transaction,  job)  are  appended  to  each  resource 
demand  vector. 

The  characterization  phase  of  the  analysis  combining  scaling, 
principal  components  analysis,  and  clustering  can  be  easily  automated. 

It  is  envisioned  that  the  various  classes  of  workload  requests  (i.e. 
batch,  time-sharing,  and  real-time)  would  be  first  segregated.  Analysis 
would  proceed  separately  on  the  different  classes.  Some  decisions 
would  still  need  to  be  made  by  the  analyst.  These  include  how  many  of 
the  principal  components  to  retain  and  how  many  clusters  to  form  if 
non-hierarchical  clustering  is  used.  The  first  decision  on  retention 
of  principal  components  can  be  built  into  the  generator.  That  is,  it 
may  be  decided  to  retain  sufficient  components  to  explain  a  particular 
proportion  of  the  variability  in  all  cases.  The  second  decision  is 
not  so  readily  made,  since  the  "optimal"  number  of  clusters  to  form  is 
largely  data  dependent.  There  is  the  need  for  a  clustering  algorithm 
which  does  not  require  this  decision. 
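
The retention rule that could be built into the generator is simple to state in code. The sketch below assumes the eigenvalues of the correlation matrix are already available; the function name, the default proportion, and the example eigenvalues are illustrative.

```python
def components_to_retain(eigenvalues, proportion=0.95):
    """Smallest number of leading components whose eigenvalues explain at
    least `proportion` of the total variability."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    running = 0.0
    for count, value in enumerate(ordered, start=1):
        running += value
        # Small tolerance guards against floating-point round-off.
        if running >= proportion * total - 1e-12:
            return count
    return len(ordered)

# Hypothetical eigenvalues of a 5-variable correlation matrix (they sum to 5).
retained = components_to_retain([2.5, 1.4, 0.7, 0.3, 0.1])
print(retained)
```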

The next two elements of the methodology are also amenable to
automation. This would require the construction of a library of general
purpose  synthetic  jobs.  This  library  must  contain  synthetic  versions 
of  batch  processing  as  well  as  transaction  oriented  jobs.  The  appro¬ 
priate  synthetic  job  would  be  selected  from  this  library  by  first 
determining  the  type  of  job  (i.e.  batch  or  interactive)  needed  by 
examining  the  flag  appended  to  the  resource  demand  vector.  The  required 
resource  demands  would  then  be  compared  against  those  demands  which 
could  be  produced  by  the  various  library  jobs.  The  appropriate  param¬ 
eter  settings  could  then  be  calculated  using  previously  developed 
predictor  equations.  Following  the  selection  of  library  jobs  and 
the  determination  of  the  required  parameter  settings,  the  synthetic  mix 
could  be  generated  by  considering  the  time  and  location  of  origin  for 
each  workstep. 

Calibration/validation  of  the  produced  synthetic  job  mix  is 
necessary  to  assure  its  representativeness.  This  requires  that  the 
synthetic  mix  be  executed  on  the  system,  and  data  collected  on  the 
resources  used.  The  resource  utilization  pattern  for  the  synthetic 
mix  is  compared  to  that  of  the  original  workload  subset.  Parameters 
are  adjusted,  and  the  process  repeated  until  the  desired  agreement  is 
reached. The details of this procedure are not clear; however, the
procedure appears feasible.

An  automatic  benchmark  generator  then  would  be  composed  of  three 
basic  modules:  a  characterization  module,  a  benchmark  generator  module 
and  a  calibration/validation  module.  These  modules  are  depicted  in 
figure 7.1.

Fig. 7.1  Automatic Benchmark Generator
[Figure not reproduced; the only recoverable label is the input, "Workload Data".]

7.3  Major Points Originated by the Research

This  study  differs  substantially  from  previous  workload  character¬ 
ization  studies.  These  differences  are  in  the  following  areas: 

(a)  This  study  proposes  a  complete  statistical  methodology  which 
can  be  used  to  construct  test  workloads.  Previous  studies  were  gen¬ 
erally  restricted  to  a  portion  of  the  problem. 

(b)  This  study  separated  the  workload  characterization  problem 
for  management  oriented  studies  from  that  of  constructing  test  work¬ 
loads.  Workload  subsets  selected  at  random  from  a  computer  workload 
are  not  likely  to  be  applicable  to  the  test  workload  construction 
problem. 

(c)  This  study  examined  the  intercorrelations  among  the  descrip¬ 
tor  variables  and  their  effects  on  the  clustering  phase  of  the  analysis. 
Previous  studies  have  largely  ignored  this  problem. 

(d) Principal components analysis was used to reduce the
dimensionality of the descriptor space. This is believed to be the
first application of this technique to the workload problem, although
one report [80] suggested its possible utility. Previous attempts
at reducing the dimensionality of the descriptor space have been
inconclusive and self-defeating.

(e)  Various  clustering  algorithms  and  weighting  schemes  were 
compared  in  this  research  as  they  apply  to  the  workload  problem. 
Previous  studies  seemed  to  rely  upon  a  given  scheme  with  little  moti¬ 
vation  for  its  use. 

(f) A general purpose synthetic job for use with batch workloads
was developed. By varying the parameter settings, this job can perform
as an I/O bound job, a compute bound job, or a balanced job. It
includes the facility for tape, disk, and unit record I/O in a somewhat
arbitrary proportion.

(h)  The  appropriate  parameter  settings  for  the  synthetic  jobs 
were  determined  from  predictor  equations  obtained  through  regression 
analysis.  Statistical  experimental  design  techniques  were  used  to 
guide  the  collection  of  data,  and  to  allow  testing  of  the  significance 
of  the  effect  that  various  parameters  have  on  resource  demands.  As  far 
as  can  be  determined,  these  techniques  have  not  previously  been  applied 
to  this  problem  although  they  are  routinely  applied  in  other  areas. 

7.4  Suggested  Areas  for  Future  Research 

The methodology which has emerged from this research has not
been subjected to the test of time. The case study of chapter VI
demonstrated the usefulness of many of the procedures employed; however,
they need to be applied to other sets of data at other installations
to gain a degree of acceptance. A complete, ready-to-run synthetic
benchmark was not produced in the case study due to a need to
limit its scope. This needs to be done so that the calibration/validation
phase of the procedure can be more clearly defined.

The "best" clustering algorithm found for this study is a non-hierarchical
clustering algorithm which requires the analyst to decide
how many clusters to form. This decision is somewhat subjective, and
is certainly data dependent. There is a need for a clustering
algorithm which removes the burden of this decision from the analyst.
This is particularly critical if the procedure is to be automated.
Development of such an algorithm would minimize the degree of human
intervention in the generation process.

7.5  Conclusions 

There  is  the  need  for  the  construction  of  test  workloads  for  use 
in  computer  performance  evaluation  studies.  This  research  has  produced 
a  statistical  methodology  which  should  prove  useful  in  this  construction 
process.  The  feasibility  of  the  major  portions  of  this  methodology 
was  demonstrated  with  a  detailed  case  study  of  the  Amdahl  470/V6  at 
Texas  A&M  University. 

As  with  any  statistical  procedure,  there  are  certain  precautions 
which  must  go  along  with  the  proposed  methodology.  Two  major  elements, 
principal  components  analysis  and  clustering,  have  been  the  subject 
of  widespread  misuse  in  the  past  [7].  The  problem  basically  comes  from 
attaching  "truth"  to  the  results  obtained  from  these  purely  mechanical 
procedures.  The  results  of  principal  components  analysis  are  scale 
dependent;  the  results  of  clustering  are  dependent  upon  the  distance 
metric  and  weighting  scheme  used.  Both,  however,  can  prove  to  be 
effective tools if used in a sound manner [7].

APPENDIX A

This appendix describes a computational procedure for representing
an $m \times n$ data matrix $X$ in terms of its principal components. This procedure
was utilized to express the scaled resource demand matrix in
terms of uncorrelated variables to preclude the biasing of clustering
results, and to allow a reduction in the dimensionality of the data
matrix as a prelude to clustering.

A 

Let  X  =  (X..)  be  an  mxn  data  matrix,  where  X..  represents  the 

■  J  1  J 

1.L  ■  L- 

value  of  the  j—  variable  for  the  i—  data  unit.  Since,  at  least  in  the 
workload  characterization  problem,  the  variables  are  expressed  in 
widely  differing  units,  the  data  must  be  scaled  to  commensurable 

A 

ranges.  Assume  that  the  elements  of  X  have  been  standardized  so  that 
each  variable  represented  has  mean  0,  variance  1. 

The variance-covariance matrix for the scaled data matrix $X$ is
given by

$$S = \frac{X^{T} X}{m} = (s_{ij}).$$

Since the elements of $X$ were standardized, $S$ is the correlation
matrix of the original variables in $X$.
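
The scaling step and the matrix $S$ can be illustrated with a minimal
sketch. The data values and the function names below are hypothetical,
chosen only for illustration; the divisor $m$ is used so that the result
agrees with $S = X^{T}X/m$.

```python
import math

def standardize(rows):
    """Scale each column to mean 0 and variance 1 (divisor m)."""
    m, n = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / m for j in range(n)]
    # A column with zero variance would cause a division by zero here;
    # such a descriptor would have to be dropped beforehand.
    stds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / m)
            for j in range(n)]
    return [[(r[j] - means[j]) / stds[j] for j in range(n)] for r in rows]

def correlation_matrix(z):
    """Compute S = Z^T Z / m for an already standardized matrix Z."""
    m, n = len(z), len(z[0])
    return [[sum(r[i] * r[j] for r in z) / m for j in range(n)]
            for i in range(n)]

# Hypothetical data: three worksteps, two descriptors in different units.
raw = [[1.0, 200.0], [2.0, 400.0], [3.0, 900.0]]
Z = standardize(raw)
S = correlation_matrix(Z)
```

Because every column of `Z` has unit variance, the diagonal of `S` is 1
and the off-diagonal entries are the correlations between descriptors.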

Now, define a new variable $Y_1$ as

$$Y_1 = \sum_{i=1}^{n} B_i X_i,$$

where the $X_i$, $i = 1, \ldots, n$, are the original variables, and the
$B_i$, $i = 1, \ldots, n$, are coefficients to be determined. The row
vector of coefficients $B$ could be defined in a number of ways;
however, the principal component solution requires that the variance of
$Y_1$ be maximal [7]. If the data matrix is evaluated in terms of the
new variable $Y_1$, the column vector
$Y_1 = (Y_{11}, Y_{21}, Y_{31}, \ldots, Y_{m1})^{T}$ given by
$Y_1 = X B^{T}$ would result. Then

$$\mathrm{Var}(Y_1) = \frac{Y_1^{T} Y_1}{m} = \frac{B X^{T} X B^{T}}{m} = B S B^{T}.$$


By choosing the elements of $B$ large, $\mathrm{Var}(Y_1)$ could be
made as large as desired. Generally, the convention that $B B^{T} = 1$
(i.e., $B$ is of unit length) is adopted. This constraint can be linked
to the objective function using a Lagrange multiplier $\mu$. Then, a
value for $B$ which yields maximal variance for $Y_1$ is found by
differentiating with respect to $B$ and setting this derivative equal
to zero. Thus

$$\frac{d}{dB}\left[ B S B^{T} + \mu \left(1 - B B^{T}\right) \right] = 2 S B^{T} - 2 \mu B^{T} = 0.$$

To yield maximal variance for $Y_1$, one must choose the vector $B$ to
satisfy $[S - \mu I] B^{T} = 0$. This is an ordinary eigenproblem. Then,
the vector $B^{T}$ is one of the eigenvectors of the matrix $S$. It is
easily shown [7] that in this case, $B^{T}$ is the eigenvector
corresponding to the largest eigenvalue $\lambda_1$ of $S$. The variable
$Y_1$ thus selected is called the first principal component of $X$.
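
The result can be checked numerically. The sketch below uses power
iteration, which is a technique swapped in here for illustration and is
not the procedure used in the study; the function name, starting vector,
and example matrix are all assumptions.

```python
import math

def first_principal_component(S, iterations=500):
    """Approximate the unit-length eigenvector of S with the largest
    eigenvalue, together with that eigenvalue."""
    n = len(S)
    # Arbitrary starting vector. Power iteration fails if the start is
    # exactly orthogonal to the dominant eigenvector (a pathological case).
    b = [float(j + 1) for j in range(n)]
    for _ in range(iterations):
        w = [sum(S[i][j] * b[j] for j in range(n)) for i in range(n)]
        norm = math.sqrt(sum(x * x for x in w))
        b = [x / norm for x in w]  # enforce the constraint B B^T = 1
    # B S B^T is the variance of Y_1, i.e. the largest eigenvalue.
    Sb = [sum(S[i][j] * b[j] for j in range(n)) for i in range(n)]
    variance = sum(b[i] * Sb[i] for i in range(n))
    return b, variance

# Hypothetical correlation matrix for two descriptors with r = 0.6.
S = [[1.0, 0.6], [0.6, 1.0]]
B, lam1 = first_principal_component(S)
```

For this matrix the iteration settles on equal weights for both
variables, and the returned variance equals the largest eigenvalue of
`S`.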

Using a procedure similar to the above, it can be shown that the
second principal component of $X$ is produced using the eigenvector
(selected orthogonal to $B^{T}$ above) corresponding to the second
largest eigenvalue of $S$. Likewise, the third, fourth, ..., $n$th
principal components are obtained using eigenvectors associated with
the third, fourth, ..., $n$th largest eigenvalues of $S$.
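
As a small worked illustration (not taken from the study), consider
$n = 2$ standardized variables with correlation $r$, $0 < r < 1$. The
eigenvalues of $S$ follow from the characteristic equation:

$$S = \begin{pmatrix} 1 & r \\ r & 1 \end{pmatrix}, \qquad \det(S - \mu I) = (1 - \mu)^{2} - r^{2} = 0 \;\Rightarrow\; \mu = 1 \pm r.$$

The largest eigenvalue $\lambda_1 = 1 + r$ has the unit-length
eigenvector $\frac{1}{\sqrt{2}}(1, 1)^{T}$, so the first principal
component is $Y_1 = (X_1 + X_2)/\sqrt{2}$ with variance $1 + r$. The
second eigenvalue $\lambda_2 = 1 - r$ has the eigenvector
$\frac{1}{\sqrt{2}}(1, -1)^{T}$, which is orthogonal to the first. Since
$\lambda_1 + \lambda_2 = 2 = n$, the first component accounts for the
fraction $(1 + r)/2$ of the total variance.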


Once all principal components have been determined, a matrix of
principal component scores can be computed by the matrix equation
$Y = XP$, where $Y$ is the matrix of component scores, $X$ is the
matrix of standardized scores, and $P$ is a matrix of coefficients
formed by placing each of the eigenvectors determined above as a column
in $P$ (the vector corresponding to the first principal component is
the first column, etc.).
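
The score computation can be sketched as follows. The data matrix `Z`
is a hypothetical standardized matrix (two variables with correlation
0.5), and `P` holds the two unit-length eigenvectors of its correlation
matrix as columns; none of these values come from the study.

```python
import math

def component_scores(Z, P):
    """Return Y = Z P, where Z is m x n (standardized) and P is n x k."""
    n, k = len(P), len(P[0])
    return [[sum(row[i] * P[i][c] for i in range(n)) for c in range(k)]
            for row in Z]

# Hypothetical standardized data: each column has mean 0 and variance 1.
Z = [[1, 1], [1, 1], [1, 1], [1, -1],
     [-1, 1], [-1, -1], [-1, -1], [-1, -1]]
a = 1.0 / math.sqrt(2.0)
P = [[a, a],
     [a, -a]]  # column 1: first component, column 2: second component

Y = component_scores(Z, P)
m = len(Y)
var1 = sum(y[0] ** 2 for y in Y) / m
var2 = sum(y[1] ** 2 for y in Y) / m
cov12 = sum(y[0] * y[1] for y in Y) / m

# Retaining only the first column of P reduces the dimensionality.
Y_reduced = component_scores(Z, [[a], [a]])
```

The two score columns are uncorrelated, and their variances are the
eigenvalues of the correlation matrix, which is what allows trailing
components to be dropped before clustering.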

Since calculation of principal components is basically an
eigenproblem, it is easily attacked using standard matrix manipulation
software available at most computer installations. The facilities
provided by the Statistical Analysis System [11] were used to isolate
the principal components and compute the component scores for the
workload data analyzed in Chapter VI.


APPENDIX  B 




This  appendix  details  the  clustering  algorithm  used  in  summarizing 
the  real  workload  subset.  The  algorithm  is  the  convergent  k-means 
approach  discussed  by  Anderberg  [7],  and  the  program  developed  is 
modeled  after  source  listings  contained  in  that  reference. 

The  convergent  k-means  approach  involves  three  basic  steps  [7]. 
These  are: 

(a)  Begin  with  an  initial  partition  of  the  data  units  into 
clusters.  This  initial  partition  can  be  arrived  at  in  a  variety  of 
ways.  One  way  is  to  select  k  of  the  data  units  as  cluster  centroids. 
These  k  units  can  be  selected  at  random,  the  first  k  units  of  the  data 
set  used,  or  some  other  technique  employed.  The  remainder  of  the  data 
units  are  then  assigned  to  the  "nearest"  cluster,  with  the  cluster 
centroid  remaining  fixed  throughout  the  initial  pass  through  the  data. 
Once  all  data  units  are  assigned  to  a  cluster,  the  centroid  vectors 
are  updated  to  reflect  the  current  cluster  memberships. 

(b)  Take  each  data  unit  in  sequence,  compute  the  distances  to 
all  cluster  centroids,  and  reallocate  the  data  unit  if  its  parent 
cluster  is  not  the  "nearest"  cluster.  In  the  event  of  reallocation, 
the  centroids  of  both  the  gaining  and  losing  clusters  are  updated. 

(c) Repeat step (b) until a full pass is made through the data
set with no reallocation of data units among clusters.
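
Steps (a) through (c) can be sketched as below. This is a minimal
illustration, not the program developed for the study: the Euclidean
distance, the guard that keeps a cluster from being emptied, the pass
limit, and the example data are all assumptions.

```python
import math

def dist(a, b):
    """Euclidean distance (the metric is an assumption of this sketch)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def centroid(points):
    n = len(points[0])
    return [sum(p[j] for p in points) / len(points) for j in range(n)]

def convergent_kmeans(data, k, max_passes=100):
    # Step (a): the first k data units are the seeds; every other unit
    # joins the nearest seed while the seeds stay fixed.
    seeds = [list(p) for p in data[:k]]
    member = list(range(k))
    for p in data[k:]:
        member.append(min(range(k), key=lambda c: dist(p, seeds[c])))
    # Centroids are updated only after the initial pass is complete.
    cents = [centroid([q for q, c in zip(data, member) if c == g])
             for g in range(k)]
    # Steps (b) and (c): reallocate units one at a time, updating the
    # gaining and losing centroids immediately, until a full pass
    # produces no reallocation.
    for _ in range(max_passes):
        moved = False
        for i, p in enumerate(data):
            old = member[i]
            new = min(range(k), key=lambda c: dist(p, cents[c]))
            # Assumed guard: never remove the last unit from a cluster.
            if dist(p, cents[new]) < dist(p, cents[old]) and member.count(old) > 1:
                member[i] = new
                for g in (old, new):
                    cents[g] = centroid(
                        [q for q, c in zip(data, member) if c == g])
                moved = True
        if not moved:
            break
    return member, cents

# Hypothetical data: both seeds fall in the same natural group, so the
# second unit is reallocated during the first pass of step (b).
data = [[0, 0], [1, 0], [0, 1], [10, 10], [10, 11], [11, 10]]
member, cents = convergent_kmeans(data, k=2)
```

In this example the initial partition is poor because the first two
units are close together, yet the reallocation passes still recover the
two natural groups.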

The convergent k-means algorithm described by Anderberg and
implemented for this study consists of a main program (driver) and five
subprograms. The logical relations among the elements are depicted in
figure B.1.

Fig. B.1 Logical Program Linkages


The main program (DRIVER) simply assigns main storage, and then
invokes subroutine EXEC. This subroutine checks that sufficient main
storage has been requested and then invokes subroutines KMEAN and
RESULT in turn. Subroutine KMEAN is the heart of the algorithm. The
data units are read in, standardized and expressed in terms of principal
component scores through repeated calls to subroutine USER. Clustering
is then accomplished with distance measures between data units and
cluster centroids obtained through invocation of function DIST. Once
clustering is achieved, subroutine RESULT is called to output the
results.

There  are  a  number  of  decisions  which  must  be  made  by  the  analyst 
prior  to  using  this  algorithm.  These  include  how  the  initial  partition 
is  arrived  at,  how  many  clusters  are  formed,  and  what  measure  of 
distance  is  used. 

For this study, the first k data units were used as the "seeds"
of the algorithm. This choice was made for lack of a decidedly better
alternative. Some experimentation with other techniques was done;
however, the results did not consistently favor one over the other.
Thus, the easiest and most straightforward approach was taken.

The  particular  implementation  of  the  clustering  algorithm  used 
in  this  study  is  shown  in  the  following  source  listings.  The  listings 
contain  liberal  comments  on  the  different  logical  stages,  rendering 
further  explanation  unnecessary. 


Fig.  B.2  Clustering  Algorithm  -  Program  Listings 

APPENDIX C

This  appendix  describes  data  collection  using  the  IBM  System 
Management  Facility  (SMF)  as  it  is  implemented  on  the  Amdahl  470/V6 
at  Texas  A&M  University.  The  basic  flow  of  information  to  SMF  is 
described,  and  the  particular  SMF  records  which  were  used  in  this 
study  are  detailed. 

SMF is an optional feature of the IBM System 360/370 operating
systems that can be selected at system generation (SYSGEN) time. SMF
collects system, job management, and data management information, and
can be linked to user-written routines which can monitor the operation
of jobs or job steps [47]. The information is collected for use by
management and systems analysts in billing customers or evaluating
system usage.

There is a variety of types of information collected by SMF. They
include

(a) accounting information such as CPU time and device and storage
utilization;

(b) data set activity such as a count of block transfer requests
(EXCPs) and the particular user of the data set;

(c) volume information such as the space available on direct
access volumes and error statistics on tape volumes;

(d) system use information such as system wait time and I/O
configuration.

The type of data which is collected can be modified by the operator
at each initial program load (IPL). For example, data set activity is
not presently collected at Texas A&M University.

There are a number of different records written by SMF. The
original manual [47] listed thirty-one such records. Depending upon the
system configuration, some additional records may be added. For
example, a HASP purge record reflecting each job's characteristics as
viewed by the spooling program and a record monitoring the activity
of the WYLBUR/370 system have been added to the collection of SMF
records used at Texas A&M University.

The various SMF records are written to the primary SMF data
set (SYS1.MANX) at critical points in the lifetime of a job. For
example, the job termination record is written whenever the job is
terminated either normally or abnormally, and data set information is
recorded whenever a data set opened by a user program is scratched,
renamed, closed or processed by end-of-volume (EOV). If SYS1.MANX
is defined on a direct-access device, as it is at Texas A&M University,
an additional SMF data set, SYS1.MANY, is also defined. Data is
recorded on SYS1.MANX until its defined extent is reached. At that
time, recording is switched to SYS1.MANY, and SYS1.MANX is copied to
a dump data set (magnetic tape). Periodically, the dump data sets
are merged to provide a complete record of system activity over some
period of time (e.g., one month).

The monthly SMF files provide a rich source of resource demand
information which can be used for workload characterization studies.
The particular SMF records which were used in this study are detailed
in the following tables. They are included here not only for
completeness, but also to point out the wide range of workload
descriptors which is available, at least on IBM compatible equipment,
without recourse to monitor data.

Table C.1 Type 4 (Step End) Record [47]

Decimal Displacement  Field Size  Contents
--------------------  ----------  ------------------------------------
0                     1           Reserved (zero)
1                     1           Record type (4)
2                     4           Time of end of step
6                     4           Date of end of step
10                    2           System identification
12                    2           System model identifier
14                    8           Job name
22                    4           Reader start time
26                    4           Reader start date
30                    8           User identification
38                    1           Step number
39                    4           Step initiation time
43                    4           Step initiation date
47                    4           Number of card image records in input data set
51                    2           Step completion code
53                    1           Step priority
54                    8           Program name
62                    8           Name of executed step
70                    2           Region size in hierarchy 0
72                    2           Region size in hierarchy 1
74                    4           Storage used in hierarchy 0
78                    4           Storage used in hierarchy 1
82                    1           Storage protect key
83                    3           Reserved
86                    4           Device allocation time
90                    4           Problem program load time
94                    8           Reserved
*102                  variable    Devices used by step
variable              1           Total length of next fields
variable              3           Step CPU time
variable              1           No. of accounting fields
variable              variable    Accounting fields

* - Bytes 0 and 1 contain the length of the field. For each
assigned device there is an eight byte field giving the device class,
unit type, channel and unit address, and a count of the EXCPs issued
for the device.
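
A displacement table of this kind maps directly onto byte slicing. The
sketch below extracts a few fields from the fixed portion of a Type 4
record using the displacements and sizes of Table C.1. The field names,
the sample record, and the decoding choices are assumptions: character
fields are taken to be EBCDIC (Python codec `cp037`), one-byte fields
are read as unsigned integers, and all other fields are left as raw
bytes because the table does not state their encoding.

```python
# (illustrative name, decimal displacement, field size, kind)
TYPE4_FIXED_LAYOUT = [
    ("record_type",     1, 1, "byte"),
    ("end_time",        2, 4, "raw"),
    ("end_date",        6, 4, "raw"),
    ("job_name",       14, 8, "text"),
    ("user_id",        30, 8, "text"),
    ("step_number",    38, 1, "byte"),
    ("completion_code", 51, 2, "raw"),
    ("program_name",   54, 8, "text"),
    ("step_name",      62, 8, "text"),
]
FIXED_LENGTH = 102  # the variable-length device section starts here

def parse_type4_fixed(record):
    """Slice the fixed portion of a Type 4 record into named fields."""
    if len(record) < FIXED_LENGTH:
        raise ValueError("record shorter than the fixed portion")
    fields = {}
    for name, offset, size, kind in TYPE4_FIXED_LAYOUT:
        raw = bytes(record[offset:offset + size])
        if kind == "text":
            # Assumed EBCDIC; strip blank and null padding.
            fields[name] = raw.decode("cp037").rstrip(" \x00")
        elif kind == "byte":
            fields[name] = raw[0]
        else:
            fields[name] = raw  # encoding not given in the table
    return fields

# Hypothetical record built only to exercise the parser.
sample = bytearray(FIXED_LENGTH)
sample[1] = 4
sample[14:22] = "PAYROLL ".encode("cp037")
sample[38] = 2
sample[54:62] = "IEBGENER".encode("cp037")
fields = parse_type4_fixed(sample)
```

The same layout-driven approach would apply to Tables C.2 and C.3 by
substituting their displacements and sizes.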

Table C.2 Type 5 (Job Termination) Record [47]

Decimal Displacement  Field Size  Contents
--------------------  ----------  ------------------------------------
0                     1           Reserved (zero)
1                     1           Record type (5)
2                     4           Time of end of job
6                     4           Date of end of job
10                    2           System identification
12                    2           System model identifier
14                    8           Job name
22                    4           Reader start time
26                    4           Reader start date
30                    8           User identification
38                    1           Number of steps in job
39                    4           Job initiation time
43                    4           Job initiation date
47                    4           Number of card images in input data set
51                    2           Job completion code
53                    1           Job priority
54                    4           Reader stop time
58                    4           Reader stop date
62                    1           Job termination indicator
63                    5           Output class indicator
68                    1           Checkpoint/restart indicator
69                    1           Reader device class
70                    1           Reader unit type
71                    1           Job input class
72                    1           Storage protect key
73                    19          Reserved
92                    1           Length of rest of record
93                    20          Programmer's name
113                   3           CPU time for job
116                   1           Number of accounting fields
117                   variable    Accounting fields

Table C.3 Type 26 (HASP Purge) Record [47]

Decimal Displacement  Field Size  Contents
--------------------  ----------  ------------------------------------
0                     1           Reserved (zero)
1                     1           Record type (26)
2                     4           Time record copied
6                     4           Date record copied
10                    2           System identification
12                    2           System model identification
14                    8           Job name
22                    4           Reader start time
26                    4           Reader start date
30                    8           User identification
38                    4           Reserved
42                    2           Subsystem identification (2)
44                    2           Section indicator
46                    2           Descriptor section length
48                    3           Reserved
51                    1           Job information
52                    4           HASP assigned job number
56                    8           Job name
64                    20          Programmer's name
84                    1           Message class
85                    1           Job class
86                    2           Execution selection priority
88                    2           Output selection priority
90                    2           Input route code
92                    8           Logical input device name
100                   4           Programmer's account number
104                   4           Programmer's box number
108                   4           Estimated execution time
112                   4           Estimated output lines
116                   4           Estimated punched output
120                   4           Default output form number
124                   2           Print copy count
126                   2           Lines per page
128                   2           Print route code
130                   2           Punch route code
132                   2           Events section length
134                   2           Reserved
136                   4           Reader stop time
140                   4           Reader stop date
144                   16          Reserved
160                   4           Execution start time
164                   4           Execution start date
168                   4           Execution stop time
172                   4           Execution stop date
176                   4           Output start time

Table C.3 continued

Decimal Displacement  Field Size  Contents
--------------------  ----------  ------------------------------------
180                   4           Output start date
184                   4           Output stop time
188                   4           Output stop date
192                   2           Actuals section length
194                   2           Reserved
196                   4           Number of input cards
200                   4           Generated output lines
204                   4           Generated punched output
208                   4           Reserved
212                   4           Printed lines
216                   4           Printed pages
220                   4           Punched cards
224                   2           Accounting identification
226                   1           Job execution level
227                   1           Local flags
228                   2           Region in 64K units
230                   1           Max disc requests in any step
231                   1           Max tape 7 requests in any step
232                   1           Max tape 9 requests in any step
233                   1           Customer group data
234                   4           Job selection priority
238                   4           Accumulated customer time
242                   4           Estimated I/O time
246                   1           Print train mounts
247                   1           Forms mounts
248                   1
                      2           Cancel rerun explanations
273                   4
285                               Reserved

APPENDIX D

This appendix describes the two synthetic jobs designed as a part
of this study. The first job was developed to emulate the resource
demands of Autobatch cluster 4 (table 6.5, p. 91). The resource
descriptor set used to characterize the demands of Autobatch jobs
contained only three elements; hence the synthetic job is quite simple.
The second job was designed to emulate the resource demands of Batch
cluster 6 (table 6.10, p. 103). The expanded resource descriptor set
used to characterize the Batch jobs necessitates a more complex
synthetic job.

The synthetic job for Autobatch cluster 4 is designed to allow the
user to specify indirectly the number of lines printed and the total
CPU time used by setting two parameters: NRLIN and NITER. The
appropriate settings for these parameters may be determined using
predictor equations established in section 6.7. A loop control
parameter LIMIT = Maximum {NRLIN, NITER} is first calculated. The main
loop is then executed a total of LIMIT times. The first NRLIN times
through the loop, an output line is produced. Other actions
accomplished each time through the loop include calculating two
pseudo-random numbers using a multiplicative congruential scheme and
performing some simple calculations on the second of these two
generated numbers. The particular implementation of the job used in
this study (WATFIV) is shown in figure D.1.
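
The control structure of this job can be sketched as follows. This is
an illustration only, not the WATFIV program of figure D.1: the
generator constants (the widely used multiplier 16807 with modulus
2^31 - 1), the seed, the output format, and the "simple calculation"
are assumptions, since the text does not specify them.

```python
import math

MODULUS = 2 ** 31 - 1  # assumed constants; the study's values are not given
MULTIPLIER = 16807

def next_random(seed):
    """One step of a multiplicative congruential generator."""
    return (MULTIPLIER * seed) % MODULUS

def autobatch_job(nrlin, niter, seed=12345):
    """Loop LIMIT = max(NRLIN, NITER) times; print on the first NRLIN."""
    limit = max(nrlin, niter)
    lines = []
    total = 0.0
    for i in range(limit):
        seed = next_random(seed)
        first = seed / MODULUS
        seed = next_random(seed)
        second = seed / MODULUS
        # Placeholder for the "simple calculations" on the second number.
        total += math.sqrt(second) * 2.0
        if i < nrlin:
            lines.append(f"{i + 1:6d} {first:.6f} {second:.6f}")
    return lines, total

lines, total = autobatch_job(nrlin=3, niter=10)
```

Raising `niter` above `nrlin` adds compute-only iterations, while
raising `nrlin` adds printed lines, which is how the two parameters
steer CPU time and output volume.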


The  synthetic  job  designed  for  Batch  cluster  6  is  somewhat  more 
complex  than  the  one  designed  for  Autobatch  cluster  4.  Four  parameters: 
NITER,  NOUT,  NTAP,  and  NDIS  are  specified  to  control  the  resource 
usage.  NITER  controls  the  number  of  times  the  "compute"  loop  is 
executed,  NOUT  controls  how  many  lines  of  output  are  produced,  NTAP 
controls  how  many  records  are  read  from  a  tape  file,  and  NDIS  controls 
how  many  records  are  read  from  a  disk  file. 

The  first  task  accomplished  is  to  establish  the  loop  control 
parameter LIMIT = Maximum {NITER, NOUT, NTAP, NDIS}. Within the main
loop  a  pseudo-random  number  is  produced.  In  addition,  the  first  NOUT 
times  through  the  loop  a  line  is  output;  the  first  NTAP  times  through 
the  loop  a  record  is  read  from  the  tape  file;  the  first  NDIS  times 
through  the  loop  a  record  is  read  from  the  disk  file;  and  the  first 
NITER  times  through  the  loop  a  compute  routine  is  invoked.  The  compute 
routine  involves  filling  two  5x5  matrices  with  random  numbers  and  then 
calling  a  routine  to  multiply  the  two  matrices  to  form  a  third  5x5 
product  matrix.  The  appropriate  settings  for  the  parameters  to  produce 
a  given  demand  pattern  can  be  determined  from  predictor  equations 
established  in  section  6.7.  The  particular  implementation  of  the  job 
used  in  this  study  (PL/I)  is  shown  in  figure  D.2. 
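
The loop structure described above can be sketched as follows. This is
not the PL/I program of figure D.2: the tape and disk files are
replaced by in-memory iterables, Python's standard `random` module
stands in for the pseudo-random number scheme, and all names are
hypothetical.

```python
import random

def multiply(a, b):
    """Multiply two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def compute(rng, size=5):
    """Fill two 5x5 matrices with random numbers and form their product."""
    a = [[rng.random() for _ in range(size)] for _ in range(size)]
    b = [[rng.random() for _ in range(size)] for _ in range(size)]
    return multiply(a, b)

def batch_job(niter, nout, ntap, ndis, tape, disk, seed=1):
    rng = random.Random(seed)
    # The caller must supply at least ntap tape and ndis disk records.
    tape_records, disk_records = iter(tape), iter(disk)
    counts = {"compute": 0, "lines": 0, "tape": 0, "disk": 0}
    output = []
    limit = max(niter, nout, ntap, ndis)
    for i in range(limit):
        r = rng.random()  # one pseudo-random number per pass
        if i < nout:
            output.append(f"LINE {i + 1:6d} {r:.6f}")
            counts["lines"] += 1
        if i < ntap:
            next(tape_records)  # stands in for a tape read
            counts["tape"] += 1
        if i < ndis:
            next(disk_records)  # stands in for a disk read
            counts["disk"] += 1
        if i < niter:
            compute(rng)
            counts["compute"] += 1
    return counts, output

counts, output = batch_job(niter=4, nout=2, ntap=3, ndis=1,
                           tape=range(10), disk=range(10))
```

Each parameter bounds one kind of activity independently, so a demand
pattern is set by choosing the four counts rather than by changing the
program.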


Fig. D.1 Program Listing for Autobatch Synthetic Job